EP3289763A1 - Method for analysing a video sequence and equipment for implementing said method - Google Patents
Method for analysing a video sequence and equipment for implementing said method
- Publication number
- EP3289763A1 (application EP16721199A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- video sequence
- sequence
- analysis
- subsequences
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- All classifications fall under H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
- H04N19/142—Detection of scene cut or scene change
- H04N19/172—Adaptive coding characterised by the coding unit, the unit being a picture, frame or field
- H04N19/177—Adaptive coding characterised by the coding unit, the unit being a group of pictures [GOP]
- H04N19/194—Adaptive coding in which the adaptation method is iterative or recursive, involving only two passes
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/51—Motion estimation or motion compensation
Definitions
- the present invention relates to a method for analyzing video sequences and a device for implementing this method. It applies in particular to the analysis of a video sequence for processing (video coding, compression, denoising, etc.) to be performed on the sequence.
- video data are generally subject to source coding to compress them, in order to limit the resources required for their transmission and/or storage.
- There are many source coding standards, such as H.264/AVC, H.265/HEVC and MPEG-2, that can be used for this purpose.
- a video sequence comprising a set of images is considered.
- Many automatic processes on video sequences require prior analysis. This is the case, for example, of two-pass variable-rate compression, where a first pass, corresponding to a prior analysis phase, mainly makes it possible to determine the complexity of the sequence before encoding it properly in a second pass corresponding to a processing phase.
- the operations of the analysis phase are often as complex as those of the subsequent processing phase. Therefore, the overall time of a processing with prior analysis turns out to be significantly higher than that of a processing without prior analysis.
- the duration of the prior analysis phase thus plays a preponderant role in determining the total duration necessary for the encoding, in other words the processing speed of the processed sequence. It is therefore desirable to reduce as much as possible the duration of the prior analysis phase, in order to achieve high processing speeds compatible with the requirements of processing video catalogs containing, for example, several thousand films.
- the first-pass analysis may also consist of a video encoding which makes it possible to extract, for each image or subsequence of images, a relationship between the video quality (taking into account the compression distortion) and the bit rate obtained after compression.
- in the second pass, all these relationships are used by the encoder to regulate the bit rate optimally.
- the prior analysis phase generally makes it possible to extract the characteristics of the noise in order to guide the denoising that is carried out during the second phase.
- a method of estimating the noise characteristics consists in removing the noise and then measuring statistics of the signal from which it has been extracted, to determine a difference between the noisy signal and the signal after noise extraction that characterizes the extracted noise. This operation generally presents the complexity of a complete denoising and, as in the case of video compression, almost doubles the processing time compared to a processing done without prior analysis.
- the document US 2010/0027622 proposes to reduce the spatial and/or temporal resolution of a video stream prior to a first encoding pass, in order to reduce the calculation time in the particular case of a two-pass video encoding.
- Reducing the spatial resolution involves decreasing the size of the analyzed images, linearly or by selecting a portion of each image.
- the statistics extracted in the first pass are then extrapolated to obtain an estimate of what would have been achieved if the entire images had been analyzed.
- to reduce the temporal resolution, the image rate is reduced in a regular or irregular manner and only the retained images are analyzed during the first pass. As before, the statistics of images that are not analyzed are extrapolated.
- Both methods can be combined, thus reducing the analysis time even further. Both are based on the idea that it is possible to extrapolate missing data from the analyzed data.
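The extrapolation principle on which these prior-art methods rest can be sketched in a few lines. The following is a minimal Python illustration, not taken from the patent: per-frame statistics measured on a subsampled set of frames are completed for the missing frames by linear interpolation (the function name and the interpolation rule are assumptions chosen for clarity):

```python
def extrapolate_statistics(analyzed, total_frames):
    """Estimate a per-frame statistic for every frame of a sequence
    from the values measured on a subsampled set of frames.

    analyzed: dict mapping frame index -> measured statistic
    total_frames: number of frames in the full sequence
    Returns a list of length total_frames.
    """
    known = sorted(analyzed)
    result = []
    for i in range(total_frames):
        if i in analyzed:
            result.append(analyzed[i])
            continue
        # Frames before the first / after the last analysed frame take
        # the nearest measured value; frames in between are linearly
        # interpolated between the surrounding analysed frames.
        if i < known[0]:
            result.append(analyzed[known[0]])
        elif i > known[-1]:
            result.append(analyzed[known[-1]])
        else:
            lo = max(k for k in known if k < i)
            hi = min(k for k in known if k > i)
            t = (i - lo) / (hi - lo)
            result.append(analyzed[lo] * (1 - t) + analyzed[hi] * t)
    return result
```

For instance, with a complexity of 0.0 measured on frame 0 and 4.0 on frame 4, the three unanalysed frames in between would be assigned 1.0, 2.0 and 3.0.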
- An object of the present invention is to provide an improved method of analysis of a video sequence as part of a multi-pass processing.
- a method of analyzing a set of images of a video sequence for a processing to be performed on the sequence, comprising determining in the video sequence a plurality of consecutive, disjoint subsequences of one or more successive images, and analyzing the images of each subsequence determined in the video sequence, wherein the subsequences are determined according to the type of processing to be performed and according to the content of the video sequence.
- images are not spatially homogeneous. Their center, for example, which is the privileged point of interest, does not have the same complexity as their edges.
- the analysis results, which may include statistical data extracted after spatial subsampling, often have only a distant relationship with the results of an analysis performed on the initial images.
- a typical example is noise, whose characteristics change when the size of the images is reduced.
- temporal subsampling also poses the problem of homogeneity of the content, which makes the extrapolation of the statistics difficult. Indeed, all the processings based on the temporal coherence of video sequences lose effectiveness, or even become inapplicable, when the sequences are temporally subsampled. This is the case, for example, of the motion estimation of a video compressor, which loses precision as the temporal distance between the images increases.
- the proposed method has the advantage of favoring temporal sub-sampling, in order to avoid the pitfalls inherent in spatial subsampling mentioned above.
- the proposed method also advantageously takes into account the content of the video sequence for its analysis, the inventors having identified the problem of content homogeneity mentioned above.
- the proposed method therefore facilitates the extrapolation of the statistics when a temporal subsampling is implemented during a phase of analysis of a video sequence.
- the proposed method can thus for example take into account the type of content (film, sport, musical show, etc.) analyzed in the context of the first pass of a multipass processing.
- the proposed method advantageously takes into account the type of processing to be performed on the video sequence (compression, denoising, etc.) during a processing phase using the analysis results generated during an analysis phase using the proposed method.
- the proposed method therefore has the advantage that it can be adapted according to the processing to be performed on a video sequence, taking into account, for the analysis phase of the sequence, the type of processing (compression, filtering, etc.) carried out later on the sequence.
- the proposed method is particularly well suited, although not exclusively, to the encoding or compression of a video sequence according to an H.264/AVC, H.265/HEVC or H.262/MPEG-2 standard. But it is also suitable for encoding images according to any two-pass video encoding scheme (an analysis pass and an encoding pass), or for any two-pass processing of a video sequence.
- the respective sizes of the subsequences and the respective gaps between two neighboring subsequences are determined according to the type of processing to be performed and according to the content of the video sequence.
- the subsequences may have an identical size, with the exception of the last subsequence of the plurality of consecutive subsequences.
- the size of the last sub-sequence will be chosen greater than or equal to the size of the other subsequences, whether the other subsequences have a single size or not.
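The rule above (a single subsequence size, except for the last subsequence, whose size is greater than or equal to that of the others) can be sketched as follows. This is a hypothetical Python illustration; the function name and the regular-gap assumption are not from the patent:

```python
def split_into_chunks(n_frames, chunk_size, gap):
    """Determine consecutive, disjoint subsequences ("chunks") of a
    sequence of n_frames images.  Every chunk has chunk_size images and
    neighbouring chunks are separated by `gap` unanalysed images; the
    last chunk absorbs any remainder, so its size is greater than or
    equal to chunk_size.  Returns (start, end) index pairs, end exclusive.
    Assumes n_frames >= chunk_size.
    """
    period = chunk_size + gap
    chunks = []
    start = 0
    # Emit regular chunks while a further chunk of at least chunk_size
    # images can still start one period later.
    while start + period + chunk_size <= n_frames:
        chunks.append((start, start + chunk_size))
        start += period
    # Last chunk: runs to the end of the sequence (size >= chunk_size).
    chunks.append((start, n_frames))
    return chunks
```

With 20 images, a chunk size of 3 and a gap of 2, this yields (0, 3), (5, 8), (10, 13) and a larger final chunk (15, 20), mirroring the four subsequences of FIG. 4b.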
- the proposed method further comprises generating, by extrapolation of the results of analysis of the subsequences of the video sequence, the results of analysis of the video sequence.
- At least one of the subsequences may contain only one image.
- the subsequences are further determined as a function of the analysis speed or the accuracy of the analysis.
- a device for analyzing a set of images of a video sequence for a processing to be performed on the sequence, comprising an input interface configured to receive the video sequence, and a sequence analysis unit, comprising a processor operatively coupled to a memory, configured to determine in the video sequence a plurality of consecutive, disjoint subsequences of one or more successive images, according to the type of processing to be performed and to the content of the video sequence, and to analyze the images of each subsequence determined in the video sequence.
- a computer program loadable into a memory associated with a processor, comprising portions of code for implementing the steps of the proposed method when said program is executed by the processor, and a set of data representing, for example by compression or encoding, said computer program.
- Another aspect relates to a non-transitory storage medium for a computer-executable program, comprising a data set representing one or more programs, said one or more programs including instructions which, during the execution of said one or more programs by a computer comprising a processing unit operatively coupled to memory means and an input/output interface module, drive the computer to analyze the images of a video sequence according to the proposed method.
- FIG. 1 is a diagram illustrating the architecture of a video sequence analysis device according to an embodiment of the proposed method
- FIG. 2 is a diagram illustrating the architecture of a video sequence processing device according to an embodiment of the proposed method
- FIG. 3 is a diagram illustrating the proposed method according to one embodiment
- FIGS. 4a and 4b are diagrams illustrating subsampling of a video sequence according to an embodiment of the proposed method
- FIGS. 5a, 5b, 5c and 5d are diagrams illustrating subsampling of a video sequence according to an embodiment of the proposed method
- FIG. 6 shows an embodiment of a computer system for implementing the proposed method.
- By subsampling is meant here any operation carrying out the extraction or selection of subsequences within a video sequence, without limitation relating to a particular method or to a particular subsampling parameter (period, recurrence, etc.), unless expressly stated. Subsampling is thus distinguished from decimation, which assumes a regular extraction of subsequences (for example, the extraction of one image every n images of the video sequence).
- the video sequence analysis device 100 receives at input 102 an input video sequence 101 to be analyzed as part of a multi-pass processing.
- the analysis device 100 comprises a controller 103, operatively coupled to the input interface 102, which controls a subsampling unit 104 and an analysis unit 105.
- the data received on the interface of input 102 are input to the subsampling unit.
- the subsampling unit 104 subsamples the video sequence according to the proposed method by determining in the video sequence a plurality of consecutive, disjoint subsequences of one or more successive images.
- the unit 104 generates, after subsampling, data representing the plurality of consecutive and disjoint subsequences of images of the sequence thus determined, which are processed by the controller 103, which supplies, at the input of the analysis unit 105, the images of the plurality of disjoint subsequences of images of the video sequence selected by the subsampling unit 104.
- the analysis unit 105 generates, after analysis of the images of the plurality of subsequences of images of the video sequence received as input, statistical data 107 relating to the input video sequence 101, which is provided by the controller 103 on an output interface 106 of the analysis device 100.
- the generation of the statistical data 107 may include an extrapolation of statistical data, using the image analysis results of the plurality of image subsequences received at the input, in order to obtain analysis results for all the images of the video sequence, and not only for those that have actually been analyzed.
- By extrapolation is meant here any operation for generating statistical data for the images that have not been analyzed (i.e., the images that have not been selected during the subsampling of the sequence), using in particular statistical data extracted for the analyzed images (i.e., the images of the plurality of image subsequences of the video sequence).
- the analysis device 100 can thus output a set of statistics that do not show the subdivision into subsequences of the initial video sequence. Subsequent processing can then be carried out using the results of the analysis phase generated by the analysis device 100 without having to know the division of the video sequence produced by the analysis device 100.
- the controller 103 is configured to drive the subsampling unit 104 and the analysis unit 105, and in particular the inputs / outputs of these units.
- the architecture of the analysis device 100 illustrated in FIG. 1 is however not limiting.
- the input interface 102 of the analysis device 100 could be operably coupled to an input interface of the subsampling unit 104.
- the subsampling unit 104 could include an output operatively coupled to an input of the analysis unit 105
- the analysis unit 105 could include an output operatively coupled to the output interface 106.
- the analysis device 100 can be a computer, a computer network, an electronic component, or another apparatus having a processor operatively coupled to a memory and, depending on the embodiment selected, a data storage unit, and other associated hardware elements such as a network interface and a media reader for reading a non-transitory removable storage medium and writing to such a medium (not shown in the figure).
- the removable storage medium may be, for example, a compact disc (CD), a digital video / versatile disc (DVD), a flash disk, a USB key, etc.
- the memory, the data storage unit, or the removable storage medium contains instructions that, when executed by the controller 103, cause the controller 103 to perform or control the input interface 102, subsampling 104, analysis 105 and/or output interface 106 portions of the exemplary embodiments of the proposed method described herein.
- the controller 103 may be a component implementing a processor or a calculation unit for the image analysis according to the proposed method and the control of the units 102, 104, 105 and 106 of the analysis device 100.
- the analysis device 100 can thus be put into the form of software which, when loaded into a memory and executed by a processor, implements the analysis of a video sequence according to the proposed method.
- the analysis device 100 can be implemented in software form, as described above, or in hardware form, as an application-specific integrated circuit (ASIC), or in the form of a combination of hardware and software elements, such as, for example, a software program intended to be loaded onto and executed on an FPGA (Field Programmable Gate Array) component.
- Fig. 2 is a diagram illustrating a video sequence processing device.
- the video sequence processing device 200 receives as input 202 an input video sequence 201 to be processed as part of a multi-pass processing.
- the processing device 200 comprises a controller 203, operably coupled to the input interface 202, which controls a processing unit 204 and an analysis unit 205.
- the data received on the input interface 202 are inputted to the analysis unit 205.
- the output data 207 of the processing device 200 is generated on an output interface 206.
- the assembly formed by the controller 203, the analysis unit 205 and the input and output interfaces 202/206 constitutes an analysis device that can correspond to the analysis device 100 described with reference to FIG. 1, configured to implement the proposed method.
- the analysis unit 205 generates, after subsampling of the video sequence according to the proposed method, data representing a plurality of subsequences of images of the sequence determined by the subsampling, which are processed by the controller 203, or by a controller of the analysis unit, for analyzing the images of the subsequences.
- the image analysis results of the subsampled subsequences may be extrapolated to generate analysis results for all the images in the video sequence.
- the controller 203 is configured to drive the analysis unit 205 and the processing unit 204, and in particular the inputs / outputs of these units.
- the analysis results produced by the analysis unit 205 are provided, under the supervision of the controller 203, at the input of the processing unit 204 for processing the input video sequence 201 in the context of a multi-pass processing, the analysis performed by the analysis unit 205 corresponding to a first pass and the processing performed by the processing unit 204 corresponding to a second pass.
- the architecture of the processing device 200 illustrated in FIG. 2 is however not limiting.
- the input interface 202 of the processing device 200 could be operably coupled to an input interface of the analysis unit 205 and to an input interface of the processing unit 204.
- the analysis unit 205 could include an output operably coupled to an input of the processing unit 204
- the processing unit 204 could include an output operably coupled to the output interface 206 to produce data corresponding to the processed video sequence 201.
- the processing device 200 may be a multi-pass video encoder, a video denoising device, or any other multi-pass video processing device in which at least one pass comprises an analysis of an input video sequence prior to its processing.
- the processing device 200 may be a computer, a computer network, an electronic component, or another apparatus comprising a processor operatively coupled to a memory and, depending on the embodiment chosen, a data storage unit, and other associated hardware elements such as a network interface and a media reader for reading a non-transitory removable storage medium and writing to such a medium (not shown in the figure).
- the memory, the data storage unit, or the removable storage medium contains instructions that, when executed by the controller 203, cause the controller 203 to perform or control the input interface 202, processing 204, analysis 205 and/or output interface 206 portions of the exemplary embodiments described herein.
- the controller 203 may be a component implementing a processor or a calculation unit for image processing comprising an analysis according to the proposed method and the control of the units 202, 204, 205 and 206 of the processing device 200.
- the processing device 200 may be implemented in software form, as described above, or in hardware form, such as an application-specific integrated circuit (ASIC), or in the form of a combination of hardware and software, such as a software program intended to be loaded onto and executed on an FPGA (Field Programmable Gate Array) component.
- Figure 3 shows a diagram illustrating the proposed method according to one embodiment.
- One or more parameters relating to the type of processing to be performed on the input video sequence are input (301) to be taken into account during the sub-sampling phase.
- one or more parameters relating to the content of the video sequence to be analyzed are entered (302) in order to be taken into account during the sub-sampling phase.
- the video sequence to be analyzed is inputted (303) for analysis.
- This video sequence is then subsampled (304) to determine a plurality of consecutive, disjoint subsequences of one or more successive images, depending on the type of processing to be performed and on the content of the video sequence, on the basis of the parameter(s) relating to the type of processing and the parameter(s) relating to the content of the video sequence.
- the images of each subsequence thus determined are then analyzed (305) according to a predetermined analysis method corresponding to the processing to be performed on the sequence, to provide subsequence analysis results.
- the image analysis of the subsequences can be followed by the extrapolation (306) of the results of this analysis to generate analysis results for all the images of the input video sequence.
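Steps 301 to 306 can be sketched end to end. The sketch below is a hypothetical Python illustration: it uses a regular chunking rule and a nearest-neighbour extrapolation, whereas the patent allows irregular, content- and processing-dependent chunking and other extrapolation methods:

```python
def analyze_sequence(frames, analyze_frame, chunk_size, gap):
    """Two-phase analysis pass: subsample the sequence into consecutive,
    disjoint chunks (step 304), analyse only the frames belonging to
    those chunks (step 305), then extrapolate the results to the whole
    sequence (step 306).

    frames: list of frames (any per-frame data)
    analyze_frame: callable returning a statistic for one frame
    chunk_size, gap: number of analysed / skipped images per period
    """
    n = len(frames)
    period = chunk_size + gap
    # 304 - determine the chunks (regular subsampling for simplicity).
    selected = [i for i in range(n) if i % period < chunk_size]
    # 305 - analyse only the selected frames.
    measured = {i: analyze_frame(frames[i]) for i in selected}
    # 306 - extrapolate: each unanalysed frame inherits the statistic
    # of the nearest analysed frame (a deliberately simple rule).
    results = []
    for i in range(n):
        nearest = min(measured, key=lambda k: abs(k - i))
        results.append(measured[nearest])
    return results
```

With one-image chunks and a gap of two, a six-frame sequence is analysed on frames 0 and 3 only, and the other four frames receive the statistic of their nearest analysed neighbour.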
- Figures 4a and 4b illustrate subsampling of a video sequence according to an embodiment of the proposed method.
- FIG. 4a schematically represents a video sequence (400) comprising a set of N images distributed over a duration D between the first image (401) of the sequence and the last image (402) of the sequence.
- Figure 4b shows the video sequence (400) after subsampling.
- the subsampling of the sequence (400) led to the determination of consecutive and disjoint subsequences (403a, 403b, 403c, 403d) of the sequence, sometimes referred to in the present application as "chunks", "subsets" or "packets".
- the subsequences (403a, 403b, 403c, 403d) determined are disjoint, in that two neighboring subsequences are separated by "holes" (404a, 404b, 404c), each hole containing at least one image of the video sequence (400).
- the holes (404a, 404b, 404c) correspond to the groups of images of the initial sequence (400) which, according to the proposed method, will not be analyzed.
- the subsequences (403a, 403b, 403c, 403d) determined are consecutive, in that they result from the sampling of the video sequence in its temporal order.
- the sampling of the video sequence may be performed following the temporal order of the sequence, to determine a series of subsequences, among which a first subsequence corresponds to the beginning of the video sequence and a last subsequence corresponds to the end of the video sequence.
- the subsequences determined by the sub-sampling are not necessarily equal in size, in that they do not contain, for example, not all the same number of images.
- the subsequences 403a, 403b and 403c are of equal size, this size being less than or equal to that of the last subsequence (403d) of the sequence (400).
- the method thus performs a temporal division of the video sequence to be analyzed into "chunks".
- Chunks are subsequences of consecutive images that do not necessarily all contain the same number of images. Images that are not part of a chunk will not be analyzed.
- the respective sizes of the video sequence and the subsequences can be expressed as a number of images, or as a duration, the two measures being linked by the number of images per second of the video sequence considered. The same applies to the gap between two neighboring subsequences, which can, depending on the implementation, be expressed as a number of images or as a duration.
- Subsampling can be done on the basis of different parameters, depending on the implementation.
- the sub-sequences derived from the sub-sampling can be determined as a function of a subsequence size, for example expressed as a number of images, which is identical for all sub-sequences with the possible exception of one subsequence (preferably the last), and of a sub-sampling frequency or sub-sampling period defining the gap between two neighboring subsequences.
- the subsequences can be determined according to a subsequence size and a subsampling rate.
- sub-sequences can be determined by setting a single size (possibly with the exception of the last subsequence) equal to one image, and a sub-sampling rate of 1/6.
- the subsequences will therefore be determined by selecting one image out of every 6 images of the video sequence to be processed for analysis. For a video sequence lasting one hour, the analysis time will be reduced to 10 minutes. Analysis results for the unanalyzed portions (i.e. those not belonging to any subsequence) may be inferred from the subsequence analysis results, for example by an extrapolation method.
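As an illustration (not part of the patent text), this selection scheme, with a chunk size of one image and a sub-sampling rate of 1/6, can be sketched in Python; the function name and zero-based frame-index convention are assumptions for the example:

```python
def select_frames(num_frames, chunk_size=1, period=6):
    """Return the indices of the frames to analyse: one chunk of
    `chunk_size` consecutive frames at the start of every
    `period`-frame window of the sequence."""
    selected = []
    for start in range(0, num_frames, period):
        selected.extend(range(start, min(start + chunk_size, num_frames)))
    return selected

# One hour of video at 24 frames per second:
frames = select_frames(3600 * 24)
print(len(frames))  # 14400 frames analysed instead of 86400, i.e. 1/6
```

With a chunk size of one image and a period of 6, exactly one sixth of the frames is retained, matching the six-fold reduction of the analysis time mentioned above.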
- the sub-sequences derived from the sub-sampling can be determined as a function of a subsequence size, for example expressed as a number of images, which is identical for all sub-sequences with the possible exception of one subsequence, of a number of subsequences, and of the size of the sequence.
- the proposed method may determine downsampling parameters taking into account the type of processing to be performed and the content of the video sequence to be processed.
- the size of the last subsequence, corresponding to the end of the video sequence to be processed, may be chosen greater than or equal to that of the other subsequences.
- the proposed method may merge the last two subsequences in the case where the size of the last subsequence would otherwise be less than a predetermined threshold, for example the common size of the other subsequences when that size is unique, or the size of at least one other subsequence.
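A minimal sketch of this merging rule (a hypothetical helper, not from the patent), representing each subsequence as a pair of frame indices with an exclusive end; merging is assumed to also cover the gap between the two chunks, so the end of the sequence is analysed over a sufficient span:

```python
def merge_short_last_chunk(chunks, threshold):
    """chunks: list of (start, end) frame-index pairs, end exclusive.
    If the last chunk is shorter than `threshold` frames, merge it with
    the previous chunk (covering the hole between them as well)."""
    if len(chunks) >= 2 and chunks[-1][1] - chunks[-1][0] < threshold:
        return chunks[:-2] + [(chunks[-2][0], chunks[-1][1])]
    return chunks

print(merge_short_last_chunk([(0, 4), (24, 28), (48, 50)], threshold=4))
# [(0, 4), (24, 50)]
```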
- Figure 5a illustrates the case of a subsampling period (501) containing a chunk (500) positioned at the beginning of the period.
- Figure 5b illustrates the case of a subsampling period (503) containing a chunk (502) positioned at the end of the period.
- the size of the last chunk of the sequence can be chosen greater than or equal to that of the other chunks.
- the last chunk (504) can be extended to cover the entire period (505), so as to ensure a more accurate analysis of the end of the video sequence.
- the last chunk (506) can be extended to cover the entire period (507), so as to ensure a more accurate analysis of the end of the video sequence.
- the distribution of the chunks, as well as their size expressed as a number of images, makes it possible to estimate the speed gain of the analysis method. Indeed, if the computational load is proportional to the number of images analyzed (which is generally the case in most applications), the proportion of images constituting the chunks directly provides an estimate of this gain.
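Under the stated assumption that the computational load is proportional to the number of images analysed, the gain can be estimated directly from the chunk layout; this small Python sketch is illustrative only:

```python
def speed_gain(total_frames, chunks):
    """Estimated analysis speed-up, assuming a constant cost per frame:
    total frames divided by the number of frames inside chunks."""
    analysed = sum(end - start for start, end in chunks)
    return total_frames / analysed

# 4-frame chunks every 24 frames, over ten minutes at 24 fps:
chunks = [(s, s + 4) for s in range(0, 600 * 24, 24)]
print(speed_gain(600 * 24, chunks))  # 6.0
```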
- the size of the chunks and their distribution depend on the application (on the processing performed on the sequence). If, depending on the application, the analysis method uses temporal information such as the movement of objects in the scene, the chunks will preferably consist of a significant number of consecutive images. For example, a chunk can be considered large when it includes several thousand consecutive images; as part of an analysis for a video compression application, some chunks might have about 3000 images. On the contrary, if the analysis processes the images independently, as for example in a brightness measurement, the chunks may be reduced in size, for example down to a single image.
- the analysis results are extrapolated for sub-sequences of images that are not analyzed. It is preferable in this case that the skipped images be of the same nature as the analyzed images from the point of view of the statistics returned. Thanks to this extrapolation, the analysis method can provide the next processing stage with a complete set of statistics, that is to say including statistics corresponding to the images that have not been analyzed.
- statistics can be extrapolated linearly: their values are assumed to lie on a line connecting the last image of a chunk to the first image of the next chunk.
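The linear extrapolation described here can be sketched as follows (illustrative Python, not from the patent; per-frame statistics are assumed to be stored in a dict keyed by frame index, and frames outside the first and last chunks are held constant):

```python
def linear_extrapolate(stats, num_frames):
    """stats: dict frame_index -> measured value for analysed frames.
    Fill each gap by linear interpolation between the last analysed
    frame of one chunk and the first analysed frame of the next."""
    known = sorted(stats)
    full = dict(stats)
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            t = (i - left) / (right - left)
            full[i] = stats[left] + t * (stats[right] - stats[left])
    for i in range(num_frames):  # before first / after last chunk
        full.setdefault(i, stats[known[0]] if i < known[0] else stats[known[-1]])
    return [full[i] for i in range(num_frames)]

print(linear_extrapolate({0: 0.0, 4: 8.0}, 6))
# [0.0, 2.0, 4.0, 6.0, 8.0, 8.0]
```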
- the size of the last chunk will preferably be chosen greater than or equal to that of the other chunks, and for example large enough to correct the analysis errors of the input video sequence.
- the size of this end-of-sequence analysis window can indeed influence the quality of analysis of the whole content, and it is preferable to choose a size sufficient to compensate for the analysis errors at the beginning of the sequence attributable to the downsampling and to the size chosen for the analysis windows at the start of the sequence.
- the sub-sampling operation of a video sequence taking into account the type of processing to be performed and the content of the video sequence is illustrated by the following two examples: the filtering of a video sequence in the context of video compression, and two-pass video compression.
- the video quality will be higher when the defects inherent in the compression are uniform over the compressed video sequence.
- the noise should therefore preferably be removed homogeneously throughout the video sequence by denoising.
- a denoising method consists in first analyzing the whole video sequence so as to identify noise characteristics (for example statistical noise characteristics), and then, in a second step, filtering the sequence using the collected characteristics to reduce the noise.
- This denoising method thus comprises a first phase of noise analysis, followed by a denoising treatment, for example noise filtering, which uses the results of the preliminary analysis phase.
- different characteristics can be acquired during the preliminary analysis phase, such as for example the noise energy and its spectral amplitude when the noise is considered to be additive white Gaussian noise.
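As an illustrative example of such a statistical characteristic (this estimator is not taken from the patent), the standard deviation of additive white Gaussian noise can be estimated on a locally flat signal from differences between neighbouring samples, whose variance is twice the noise variance:

```python
import random
import statistics

def estimate_noise_std(samples):
    """Estimate the sigma of additive white Gaussian noise on a
    (locally flat) signal: neighbouring-sample differences have
    variance 2 * sigma**2."""
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    return (statistics.pvariance(diffs) / 2) ** 0.5

random.seed(0)
flat = [128 + random.gauss(0, 5) for _ in range(100_000)]
print(round(estimate_noise_std(flat)))  # 5
```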
- the type of content of the video sequence to be denoised is taken into account in order to determine in the video sequence a plurality of consecutive, disjoint subsequences of one or more successive images, during the analysis phase prior to denoising.
- the noise to be removed will typically be film grain (from the film stock, or artificially added in the case of digital cinema).
- the characteristics of this noise can be considered as being homogeneous throughout the film.
- it is not added in a linear way: it can, for example, be stronger in dark scenes than in bright scenes.
- nonlinear parameters can be calculated, allowing a complete identification of the noise, using for example the technique described in US patent application US 2004/005365.
- non-linear parameters such as those transported in the H.264 and HEVC video compression standards may be calculated. These can indeed be used to model multiplicative noises with coefficients depending on the luminous intensity.
- a video sequence whose content is a cinema film
- measurements of statistical characteristics of the noise are preferably made at regular intervals, using sub-sequences, or "chunks", of a few images.
- Subsampling of the video sequence is performed by determining in the video sequence a plurality of consecutive, disjoint subsequences, each comprising one or more successive images. These subsequences are determined according to the type of processing to be performed (in this example noise filtering) and according to the content of the video sequence (in this example a cinema film).
- the size (expressed as a number of frames or as a duration) of the last subsequence, as well as the gap between this last sub-sequence and the previous one, may be chosen differently.
- other subsequences determined by downsampling may be chosen differently.
- chunks of size Nc1
- a last chunk of size Nc2 greater than or equal to Nc1
- Sampling can be done every second, which makes it possible to calculate the total number of chunks from the total duration Ds of the video sequence.
- an extrapolation of the chunks analysis results is performed.
- Different methods of extrapolation of the acquired statistics can, if necessary, be used. For example, one can consider that these statistics on the unanalyzed images are identical to those of the images of the ends of the adjacent chunks. This method is particularly valid when chunks are composed of few images, as is the case in the embodiment described above where they are limited to 4 images.
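This nearest-chunk-end extrapolation can be sketched as follows (illustrative Python; the function name and data layout are assumptions):

```python
import bisect

def nearest_extrapolate(stats, num_frames):
    """Assign each unanalysed frame the statistic of the nearest
    analysed frame, i.e. the end of the closest adjacent chunk."""
    known = sorted(stats)
    out = []
    for i in range(num_frames):
        j = bisect.bisect_left(known, i)
        if j == 0:                       # before the first analysed frame
            out.append(stats[known[0]])
        elif j == len(known):            # after the last analysed frame
            out.append(stats[known[-1]])
        else:
            left, right = known[j - 1], known[j]
            out.append(stats[left] if i - left <= right - i else stats[right])
    return out

print(nearest_extrapolate({1: 10.0, 5: 20.0}, 7))
# [10.0, 10.0, 10.0, 10.0, 20.0, 20.0, 20.0]
```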
- the analysis method may be configured to perform measurements on 4 consecutive images (the chunks will therefore all have a size of 4 images, possibly with the exception of the last chunk) out of every 24 images, 24 images representing one second of film. The analysis is thus carried out 6 times faster than if all the images of the sequence were analyzed.
- the division of the sequence is adapted both to the content and to the type of processing performed later. Indeed, the chunk period and the chunk durations proposed above are well suited to film-type video sequences.
- the invention makes it possible to optimize the analysis speed according to the type of content, which is an important advantage in the case of batch processes.
- Two-pass video compression usually aims to ensure consistent video quality over the entire compressed video sequence, while maintaining a specified total size.
- the first pass is an analysis phase, having for example the purpose of determining the complexity of the sequence.
- the second pass is the actual compression phase. The latter uses the results of the analysis phase, in the above example the complexities obtained, to optimize the flow locally and maintain both a constant quality and a total size.
- the analysis phase may comprise determining, for each image, a relationship between a compression parameter (a quantizer) and a corresponding bit rate. This relationship is generally obtained by compressing the video sequence, so that the analysis phase proves to be almost as expensive as the processing phase (the compression itself).
- the statistical summary produced at the end of the analysis phase consists, for each image, of a quantizer-bitrate relationship.
- a plurality of consecutive, disjoint sub-sequences of one or more successive images is determined in the video sequence to be compressed.
- the images of each subsequence are analyzed to produce, for each image of each analyzed sub-sequence, a quantizer-bitrate relationship.
- statistics for the unanalyzed portions may be generated by extrapolation using the subsequence analysis results.
- subsampling is carried out taking into account the type of processing to be performed (in this example a compression) and the content of the video sequence.
- subsampling parameters are selected taking into account the type of processing to be performed and the content of the video sequence to be processed.
- the video sequence to be compressed is a movie film (for example in the context of a video-on-demand application)
- the relationship between the compression parameter (s) and the bit rate obtained varies little.
- films have substantially homogeneous characteristics, at least locally. Therefore, it is not necessary to perform the analysis on the entire film: suitably distributed sub-sequences of a few seconds are enough to provide the information needed for the second pass.
- the video sequence containing a film may for example be cut into subsequences of a duration equal to ten seconds, the analysis being performed only once every minute of content, which determines the gap between two neighboring subsequences.
- Analysis results for example complexity
- the extreme end of the sequence can be completely analyzed by choosing a size for the last subsequence larger than that of the other subsequences.
- the last subsequence can be chosen to correspond to the last minutes of the sequence, in the case where its content is a film. This last subsequence can in particular be composed of the last two minutes of the sequence, i.e. about 3000 images.
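The layout just described, ten-second subsequences every minute of content with the last two minutes analysed in full, can be sketched as follows (illustrative Python; the 25 frames-per-second figure is inferred from the "3000 images for two minutes" example above):

```python
def two_pass_chunks(duration_s, fps=25, chunk_s=10, period_s=60, tail_s=120):
    """Chunk layout assumed for the first (analysis) pass: a `chunk_s`-second
    sub-sequence at the start of every `period_s` seconds of content, with
    the last `tail_s` seconds of the sequence analysed in full."""
    chunks = []
    for start in range(0, duration_s - tail_s, period_s):
        chunks.append((start * fps, (start + chunk_s) * fps))
    chunks.append(((duration_s - tail_s) * fps, duration_s * fps))
    return chunks

chunks = two_pass_chunks(2 * 3600)       # a two-hour film
print(chunks[-1][1] - chunks[-1][0])     # 3000 frames in the last chunk
```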
- an objective of analysis speed, of analysis accuracy, or a criterion representing a compromise between the speed and the accuracy of the analysis may also be taken into account. It is indeed possible to modify the distribution of the analyzed zones to further accelerate the analysis phase, or to improve the accuracy of the analysis. For example, a compromise is made between the speed of the analysis and its accuracy, and the distribution of the subsequences determined according to the type of processing to be performed and according to the content of the video sequence to be processed is modified depending on whether the focus is on the speed of the analysis or on its quality.
- the duration of the respective subsequences determined according to the type of processing to be performed and according to the content of the video sequence to be processed can be reduced, for example by a predetermined factor corresponding to the analysis speed gain sought. In a particular embodiment, this can be achieved by halving the duration of the analyzed sub-sequences, for example reducing it from 10 s to 5 s.
- the respective durations of the sub-sequences determined according to the type of processing to be performed and according to the content of the video sequence to be processed can be increased, for example by a predetermined factor corresponding to the analysis accuracy gain sought. In a particular embodiment, this can be obtained by increasing the duration of the analyzed sub-sequences or by increasing the duration of the final sub-sequence.
- Embodiments of the method of analyzing a video sequence may be, at least in part, implemented on virtually any type of computer, regardless of the platform used.
- FIG. 6 shows a computer system (600) which may correspond to the video sequence analysis and video sequence processing units shown in the preceding figures.
- a data processing unit (601) which comprises one or more processors (602), such as a central processing unit (CPU) or another hardware processor, an associated memory (603) (for example, a random access memory (RAM), a cache memory, a flash memory, etc.), a storage device (604) (for example a hard disk, an optical disk such as a CD or a DVD, a flash memory key, etc.), and many other elements and features typical of current computers (not shown).
- the data processing unit (601) also comprises an input/output interface module (605) which controls the different interfaces between the unit (601) and the input and/or output means of the system (600).
- the system (600) may indeed also include input means, such as a keyboard (606), a mouse (607), or a microphone (not shown).
- the computer (600) may include output means, such as a monitor (608) (for example, a liquid crystal display (LCD) monitor, an LED display monitor, or a cathode ray tube (CRT) monitor).
- the computer system (600) can be connected to a network (609) (for example, a local area network (LAN), a wide area network (WAN) such as the Internet, or any other similar type of network) via a network interface connection (not shown).
- the computer system (600) comprises at least the minimal means of processing, input and / or output necessary to practice one or more embodiments of the proposed analysis method.
- the processor (602) is adapted to be configured to execute a computer program including portions of code for implementing an analyzer, configured to perform the analysis of an input video sequence according to the different embodiments of the proposed analysis method.
- the storage device (604) will preferably be chosen to store the data corresponding to the results of the analysis and processing of the video sequence.
- one or more elements of the aforementioned computer system (600) may be at a remote location and be connected to other elements on a network.
- one or more embodiments may be implemented on a distributed system having a plurality of nodes, where each portion of the implementation may be located on a different node within the distributed system.
- the node corresponds to a computer system.
- the node may correspond to a processor with associated physical memory.
- the node may also correspond to a processor with shared memory and / or shared resources.
- software instructions for performing one or more embodiments may be stored on a non-transitory computer-readable medium such as a compact disc (CD), a floppy disk, a tape, or any other computer-readable storage device.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Analysis (AREA)
- Television Signal Processing For Recording (AREA)
Abstract
Description
Claims
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR1553792A FR3035729B1 (en) | 2015-04-28 | 2015-04-28 | METHOD OF ANALYSIS OF A VIDEO SEQUENCE AND EQUIPMENT FOR THE IMPLEMENTATION OF THE PROCESS |
PCT/FR2016/050911 WO2016174329A1 (en) | 2015-04-28 | 2016-04-20 | Method for analysing a video sequence and equipment for implementing said method |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3289763A1 true EP3289763A1 (en) | 2018-03-07 |
Family
ID=53404776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP16721199.4A Ceased EP3289763A1 (en) | 2015-04-28 | 2016-04-20 | Method for analysing a video sequence and equipment for implementing said method |
Country Status (4)
Country | Link |
---|---|
US (1) | US10742987B2 (en) |
EP (1) | EP3289763A1 (en) |
FR (1) | FR3035729B1 (en) |
WO (1) | WO2016174329A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111133761B (en) * | 2017-09-22 | 2021-05-14 | 杜比实验室特许公司 | Method, apparatus and system for encoding or decoding video data using image metadata |
CN113298225A (en) * | 2020-09-03 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data processing method, audio noise reduction method and neural network model |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006012384A2 (en) * | 2004-07-20 | 2006-02-02 | Qualcomm Incorporated | Method and apparatus for encoder assisted-frame rate up conversion (ea-fruc) for video compression |
US20070116126A1 (en) * | 2005-11-18 | 2007-05-24 | Apple Computer, Inc. | Multipass video encoding and rate control using subsampling of frames |
WO2013049412A2 (en) * | 2011-09-29 | 2013-04-04 | Dolby Laboratories Licensing Corporation | Reduced complexity motion compensated temporal processing |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1369116A1 (en) | 1994-11-11 | 2003-12-10 | Debiopharm S.A. | Oxalatoplatin and 5-fluorouracil for combination therapy of cancer |
US6892193B2 (en) * | 2001-05-10 | 2005-05-10 | International Business Machines Corporation | Method and apparatus for inducing classifiers for multimedia based on unified representation of features reflecting disparate modalities |
US6993535B2 (en) * | 2001-06-18 | 2006-01-31 | International Business Machines Corporation | Business method and apparatus for employing induced multimedia classifiers based on unified representation of features reflecting disparate modalities |
EP2087739A2 (en) | 2006-10-25 | 2009-08-12 | Thomson Licensing | Methods and apparatus for efficient first-pass encoding in a multi-pass encoder |
KR20150080278A (en) * | 2013-12-31 | 2015-07-09 | 주식회사 케이티 | Apparatus for providing advertisement content and video content and method thereof |
-
2015
- 2015-04-28 FR FR1553792A patent/FR3035729B1/en active Active
-
2016
- 2016-04-20 WO PCT/FR2016/050911 patent/WO2016174329A1/en active Application Filing
- 2016-04-20 EP EP16721199.4A patent/EP3289763A1/en not_active Ceased
- 2016-04-20 US US15/569,446 patent/US10742987B2/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006012384A2 (en) * | 2004-07-20 | 2006-02-02 | Qualcomm Incorporated | Method and apparatus for encoder assisted-frame rate up conversion (ea-fruc) for video compression |
US20070116126A1 (en) * | 2005-11-18 | 2007-05-24 | Apple Computer, Inc. | Multipass video encoding and rate control using subsampling of frames |
WO2013049412A2 (en) * | 2011-09-29 | 2013-04-04 | Dolby Laboratories Licensing Corporation | Reduced complexity motion compensated temporal processing |
Non-Patent Citations (1)
Title |
---|
See also references of WO2016174329A1 * |
Also Published As
Publication number | Publication date |
---|---|
US10742987B2 (en) | 2020-08-11 |
WO2016174329A1 (en) | 2016-11-03 |
FR3035729A1 (en) | 2016-11-04 |
US20180302627A1 (en) | 2018-10-18 |
FR3035729B1 (en) | 2020-11-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3225029B1 (en) | Image encoding method and equipment for implementing the method | |
EP3340169B1 (en) | Hybrid image and video denoising based on interest metrics | |
EP3571834B1 (en) | Adaptive generation of a high dynamic range image of a scene, on the basis of a plurality of images obtained by non-destructive reading of an image sensor | |
EP3490255B1 (en) | Intelligent compression of grainy video content | |
FR2912237A1 (en) | IMAGE PROCESSING METHOD | |
FR3098072A1 (en) | Process for processing a set of images from a video sequence | |
EP3289763A1 (en) | Method for analysing a video sequence and equipment for implementing said method | |
WO2013068687A1 (en) | Method and device for estimating a degree of porosity of a sample of material on the basis of at least one image coded by grey levels | |
FR2996034A1 (en) | Method for generating high dynamic range image representing scene in e.g. digital still camera, involves generating composite images by superposition of obtained images, and generating high dynamic range image using composite images | |
EP3449634A1 (en) | Method for the contextual composition of an intermediate video representation | |
FR3002062A1 (en) | SYSTEM AND METHOD FOR DYNAMICALLY REDUCING ENTROPY OF A SIGNAL BEFORE A DATA COMPRESSION DEVICE. | |
EP2887307B1 (en) | Image-processing method, in particular for images from night-vision systems and associated system | |
EP3797509B1 (en) | Processing of impulse noise in a video sequence | |
FR3020736A1 (en) | METHOD FOR QUALITY EVALUATION OF A SEQUENCE OF DIGITAL DECODE DIGITAL IMAGES, ENCODING METHOD, DEVICES AND COMPUTER PROGRAMS | |
EP2943935B1 (en) | Estimation of the movement of an image | |
EP4150574B1 (en) | Method for processing images | |
EP3130144B1 (en) | Method for calibrating a digital imager | |
FR3102026A1 (en) | SEMANTICALLY SEGMENTED VIDEO IMAGE COMPRESSION | |
FR3066633A1 (en) | METHOD FOR DEFLOWING AN IMAGE | |
FR2957744A1 (en) | METHOD FOR PROCESSING A VIDEO SEQUENCE AND ASSOCIATED DEVICE | |
FR3123734A1 (en) | Pixel data processing method, device and corresponding program | |
FR3116364A1 (en) | Visualization aid solution to simulate a self-exposure process | |
FR3103302A1 (en) | IMAGE SEGMENTATION BY OPTICAL FLOW | |
FR3118558A1 (en) | Method for calibrating an array of photodetectors, calibration device and associated imaging system | |
Shrestha et al. | Video quality analysis for concert video mashup generation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20171013 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20180828 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R003 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
|
18R | Application refused |
Effective date: 20211111 |