WO2016111239A1 - Image processing device, image processing method, and program recording medium - Google Patents

Info

Publication number
WO2016111239A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame image
image
frame
movement amount
interest
Prior art date
Application number
PCT/JP2016/000013
Other languages
English (en)
Japanese (ja)
Inventor
真澄 石川
仁 河村
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to JP2016568360A (patent JP6708131B2)
Publication of WO2016111239A1

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory

Definitions

  • The present invention relates to a video processing device, a video processing method, and a program recording medium.
  • Photosensitivity seizure is one of the symptoms of an abnormal response to light stimulation, and is a seizure showing symptoms similar to epilepsy such as convulsions and disturbance of consciousness.
  • In order to suppress the occurrence of such effects, attempts are being made to restrict the distribution of video content that has a negative effect on the human body (Non-Patent Document 1).
  • ITU: International Telecommunication Union
  • In Japan, the Japan Broadcasting Corporation and the Japan Commercial Broadcasters Association have established guidelines, particularly for animation production, and require those involved in broadcasting to comply with them (Non-Patent Document 2).
  • One type of video containing many flickers that can trigger photosensitivity seizures is footage of a press conference in which news photographers fire many camera flashes. In such video, each flash produces a short-lived bright region, and the repetition of flashes produces many blinks.
  • Patent Documents 1 to 3 disclose related techniques for detecting and correcting video content that has an adverse effect on the human body.
  • Patent Document 1 discloses a technique for detecting a scene (image) that induces a photosensitivity seizure on a liquid crystal display and reducing the luminance of the backlight unit for the detected scene. This technique prevents the effects of photosensitivity seizures on viewers.
  • Patent Document 2 discloses a technique for correcting the dynamic range of the (n + 1)th frame image by gamma correction or tone curve correction, based on a comparison of the histograms of the nth and (n + 1)th frame images. This technique relieves strong blinking and reduces eye strain or poor health.
  • Patent Document 3 discloses a technique for correcting a motion vector.
  • Non-Patent Document 3 and Non-Patent Document 4 disclose optical flow calculation methods described later.
  • However, the related techniques have the following problem. Large changes in brightness or saturation that can trigger photosensitivity seizures may occur only in some areas of the image rather than in the entire image. Because the related techniques correct the entire image uniformly without determining where such changes occur, they also reduce the contrast and brightness of areas that do not blink and do not need correction, and may thereby degrade the image quality of those areas.
  • An object of the present invention is to provide a technique capable of generating a natural video in which fluctuations in luminance or saturation are suppressed.
  • An image processing device includes: determination means for determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from the preceding and following frame images; motion estimation means for estimating a first movement amount caused by movement of the camera and/or a second movement amount caused by movement of the subject, based on a pair of frame images selected from the frame image of interest and the frame images before and after it on the basis of differences in luminance or saturation; image generation means for generating, based on the selected pair and the estimated first movement amount and/or second movement amount, a correction frame image corresponding to a frame image at the shooting time of the frame image of interest; and image synthesizing means for synthesizing the frame image of interest and the correction frame image.
  • An image processing method includes: determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from the preceding and following frame images; estimating a first movement amount caused by movement of the camera and/or a second movement amount caused by movement of the subject, based on a pair of frame images selected from the frame image of interest and the frame images before and after it on the basis of differences in luminance or saturation; generating, based on the selected pair and the estimated first movement amount and/or second movement amount, a correction frame image corresponding to a frame image at the shooting time of the frame image of interest; and synthesizing the frame image of interest and the correction frame image.
  • A program recording medium stores a program that causes a computer to execute: a process of determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from the preceding and following frame images; a process of estimating a first movement amount caused by movement of the camera and/or a second movement amount caused by movement of the subject, based on a pair of frame images selected from the frame image of interest and the frame images before and after it on the basis of differences in luminance or saturation; a process of generating, based on the selected pair and the estimated first movement amount and/or second movement amount, a correction frame image corresponding to a frame image at the shooting time of the frame image of interest; and a process of synthesizing the frame image of interest and the correction frame image.
  • An image processing device includes: selection means for selecting a first frame image and a second frame image from a plurality of temporally continuous frame images; first estimation means for calculating a geometric transformation parameter based on the positional relationship between corresponding points or corresponding regions detected between the first frame image and the second frame image, and estimating a first movement amount caused by movement of the camera; and second estimation means for detecting a subject region from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and estimating, based on the detected subject region, a second movement amount caused by movement of the subject.
  • FIG. 1 is a block diagram of a video processing apparatus according to the first embodiment.
  • FIG. 2 is a schematic diagram showing a rectangular area luminance calculation method.
  • FIG. 3 is a block diagram of the motion estimator in the first embodiment.
  • FIG. 4 is a schematic diagram illustrating a method of selecting a frame image that does not include a bright region.
  • FIG. 5 is a diagram illustrating a method for selecting a motion estimation frame.
  • FIG. 6 is a diagram illustrating a method for selecting a motion estimation frame.
  • FIG. 7 is a diagram illustrating an example of a method for selecting a motion estimation frame pair.
  • FIG. 8 is a block diagram of a correction frame generation unit in the first embodiment.
  • FIG. 9 is a graph showing an example of a method for setting the value of the rate of change in local area luminance in the output frame image.
  • FIG. 10 is a flowchart showing the operation of the video processing apparatus according to the first embodiment.
  • FIG. 11 is a block diagram illustrating a hardware configuration of the computer apparatus.
  • FIG. 1 is a block diagram showing a configuration of a video processing apparatus 100 according to the first embodiment of the present invention. Note that the arrows described in FIG. 1 (and the subsequent block diagrams) merely show an example of the data flow, and are not intended to limit the data flow.
  • The video processing apparatus 100 includes a determination unit 11, a motion estimation unit 12, an image generation unit 13, and an image synthesis unit 14.
  • The determination unit 11 determines whether or not a frame image includes a region that may induce a photosensitivity seizure. Specifically, the determination unit 11 uses a preset number of frame images to determine whether a specific frame image (hereinafter referred to as the “target frame image” or “frame image of interest”) includes a region that blinks due to a flash or the like, that is, a region where the luminance changes greatly. In the following, such a region is referred to as a “blinking region”. For example, when the determination unit 11 receives time-sequential frame images for (2m + 1) frames taken from time (t − m) to time (t + m), it treats the frame image at time t as the frame image of interest and determines whether that frame image includes a blinking region.
  • The motion estimation unit 12, the image generation unit 13, and the image synthesis unit 14 synthesize a frame image in which the movement of the image caused by displacement of the camera or the subject is corrected.
  • By appropriately suppressing the luminance change in the blinking region in this way, the motion estimation unit 12, the image generation unit 13, and the image synthesis unit 14 can output a frame image in which the influence of blinking is reduced.
  • The blinking region includes a bright region, in which the luminance of the frame image of interest rises sharply (becomes brighter), and a dark region, in which it falls sharply (becomes darker).
  • Only the bright region will be described below.
  • The determination unit 11 determines whether the target frame image is a frame image including a blinking region.
  • One method for making this determination uses the change rate of the local area luminance between the target frame image and the other input frame images.
  • The local area luminance is, for each pixel of the input frame images, the luminance value of an area consisting of that pixel and a predetermined number of surrounding pixels.
  • The determination unit 11 first converts the color information of each pixel of the input frame images, described in the RGB color system or the like, into luminance information (a luminance value) representing brightness. The determination unit 11 then smooths the converted luminance information using the pixels around the target pixel, thereby calculating the luminance value of the area around the pixel.
  • Methods for converting color information into luminance information include, for example, calculating the Y value representing luminance in the YUV (YCbCr, YPbPr) color system used for broadcasting, or the Y value representing luminance in the XYZ color system.
  • The color systems used to describe luminance information are not limited to these.
  • For example, the determination unit 11 may convert the color information into another index representing luminance, such as the V value of the HSV color system.
  • If the input frame images have been gamma-corrected, the determination unit 11 may restore the color information to its pre-correction values by inverse gamma correction before converting it into luminance information.
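The color-to-luminance conversion described above can be sketched as follows. This is a minimal illustration and not the patent's implementation: it assumes 8-bit RGB input, the ITU-R BT.601 luma weights for the Y value, and a simple power-law model (exponent 2.2) for inverse gamma correction. The function names `inverse_gamma`, `rgb_to_luma`, and `luminance_image` are hypothetical.

```python
def inverse_gamma(value, gamma=2.2, max_value=255.0):
    """Undo a simple power-law gamma encoding (assumed model, gamma=2.2)."""
    return max_value * (value / max_value) ** gamma


def rgb_to_luma(r, g, b):
    """Y value of an RGB pixel using BT.601 weights (assumed choice of Y)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def luminance_image(rgb_image, undo_gamma=False):
    """Convert a 2-D list of (R, G, B) tuples into a 2-D list of Y values."""
    out = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            if undo_gamma:
                # Optional step: restore pre-correction values first.
                r, g, b = inverse_gamma(r), inverse_gamma(g), inverse_gamma(b)
            out_row.append(rgb_to_luma(r, g, b))
        out.append(out_row)
    return out
```

The V value of the HSV color system, mentioned above as an alternative index, would simply be `max(r, g, b)` in place of `rgb_to_luma`.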
  • One smoothing method is, for example, to take the average of the luminance information of the pixels within q pixels above and below and p pixels to the left and right of the target pixel, that is, of (2p + 1) × (2q + 1) pixels.
  • In this case, the local area luminance l_t(x, y) of the pixel at position (x, y) in the frame image at time t is expressed by Equation (1) using the luminance information Y_t of the frame image.
  • Alternatively, the determination unit 11 may calculate the local area luminance l_t(x, y) as a weighted average using preset weights w, as in Equation (2).
  • For example, the determination unit 11 calculates Gaussian weights w(i, j) by Equation (3) using a preset parameter σ.
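Equations (1) to (3) are not reproduced in this text. Forms consistent with the description above would be the following. The normalization of the weighted average in (2), the omission of a constant factor in (3), and the reading of the preset parameter as a standard deviation σ are assumptions.

```latex
\begin{align}
l_t(x, y) &= \frac{1}{(2p+1)(2q+1)} \sum_{i=-p}^{p} \sum_{j=-q}^{q} Y_t(x+i,\, y+j) \tag{1} \\
l_t(x, y) &= \frac{\sum_{i=-p}^{p} \sum_{j=-q}^{q} w(i, j)\, Y_t(x+i,\, y+j)}
                  {\sum_{i=-p}^{p} \sum_{j=-q}^{q} w(i, j)} \tag{2} \\
w(i, j) &= \exp\!\left(-\frac{i^2 + j^2}{2\sigma^2}\right) \tag{3}
\end{align}
```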
  • The change rate of the local area luminance is the ratio of the change in local area luminance between a pixel of the target frame image and the pixel at the same position in another input frame image.
  • The determination unit 11 calculates the change rate r_{t-t+k}(x, y) of the local area luminance between the pixel at each position (x, y) of the frame image of interest at time t and the corresponding pixel of the frame image at time (t + k) using Equation (4).
  • Based on the calculated change rate, the determination unit 11 determines whether or not the frame image of interest includes a region that is brighter than the other frame images by a predetermined level or more. When the frame image of interest includes a region that is brighter by a predetermined level or more than other frame images both before and after it in time, the determination unit 11 determines that the frame image of interest is a frame image including a bright region caused by blinking.
  • For example, the determination unit 11 uses a preset change-rate threshold and a preset area-rate threshold, and makes the determination according to whether the area rate of the region in which the change rate r_{t-t+k} exceeds the change-rate threshold is itself greater than the area-rate threshold.
  • The determination unit 11 sets the determination flag flag_{t-t+k} to “1” when it determines that the frame image of interest at time t includes a region that is brighter than the frame image at time (t + k) by a predetermined level or more. Otherwise, it sets flag_{t-t+k} to “0”. The determination unit 11 calculates determination flags in the same way for the combinations of the frame image of interest with all the other input frame images, and checks whether a frame image whose determination flag is “1” exists both before and after the frame image of interest in time. When such frame images exist, the determination unit 11 determines that the frame image of interest is a frame image including a bright region.
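The determination procedure above can be sketched as follows. This is only an illustration under stated assumptions: Equation (4) is not reproduced in this text, so the change rate is assumed to be (l_t − l_{t+k}) / l_{t+k}; the threshold names `alpha` (change rate) and `beta` (area rate), their default values, and the clipping of the averaging window at image borders are all hypothetical choices.

```python
def local_luminance(Y, x, y, p=1, q=1):
    """Mean of Y over the (2p+1) x (2q+1) window at (x, y), clipped at borders."""
    h, w = len(Y), len(Y[0])
    vals = [Y[j][i]
            for j in range(max(0, y - q), min(h, y + q + 1))
            for i in range(max(0, x - p), min(w, x + p + 1))]
    return sum(vals) / len(vals)


def determination_flag(Y_t, Y_other, alpha=0.5, beta=0.2, p=1, q=1, eps=1e-6):
    """Return 1 if frame t has a sufficiently large area brighter than the other frame."""
    h, w = len(Y_t), len(Y_t[0])
    bright = 0
    for y in range(h):
        for x in range(w):
            l_t = local_luminance(Y_t, x, y, p, q)
            l_o = local_luminance(Y_other, x, y, p, q)
            rate = (l_t - l_o) / max(l_o, eps)  # assumed form of the change rate
            if rate > alpha:
                bright += 1
    return 1 if bright / (h * w) > beta else 0


def includes_bright_region(frames, t, alpha=0.5, beta=0.2):
    """Frame t includes a bright region if a flag of 1 exists both before and after t."""
    flags = {k: determination_flag(frames[t], frames[k], alpha, beta)
             for k in range(len(frames)) if k != t}
    before = any(f == 1 for k, f in flags.items() if k < t)
    after = any(f == 1 for k, f in flags.items() if k > t)
    return before and after, flags
```

For a five-frame window in which only the middle frame contains a flash-lit area, `includes_bright_region(frames, 2)` reports a bright region, whereas the same call for a neighboring dark frame does not.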
  • As another method of determining whether the frame image of interest includes a blinking region, the determination unit 11 may use the change rate of the rectangular area luminance.
  • The rectangular area luminance is the average luminance of each rectangular area set in advance in each frame image.
  • For example, when 10 × 10 blocks of rectangular areas are set in the frame image, the rectangular area luminance is the average of the luminance values of the pixels included in each rectangular area.
  • As the luminance value, the Y value of the YUV color system, the Y value of the XYZ color system, the V value of the HSV color system, or the like can be used, as in the calculation of the local area luminance.
  • The change rate of the rectangular area luminance is the ratio of the difference between the rectangular area luminance of the block of interest in the target frame image and that of the block at the same position in another input frame image.
  • The determination unit 11 calculates, using Equation (5), the change rate R_{t-t+k}(i, j) between the rectangular area luminance L_t(i, j) of the block at position (i, j) of the target frame image at time t and the rectangular area luminance L_{t+k}(i, j) of the frame image at time (t + k).
  • The determination using the change rate of the rectangular area luminance is performed in the same manner as the determination using the change rate of the local area luminance.
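A minimal sketch of the rectangular area luminance and its change rate is given below. Equation (5) is not reproduced in this text, so the change rate is assumed to be (L_t − L_{t+k}) / L_{t+k}; the function names and the way the image is split into blocks are hypothetical, and the image is assumed to have at least as many rows and columns as blocks.

```python
def rect_luminance(Y, blocks_x, blocks_y):
    """Average luminance of each block in a blocks_y x blocks_x grid over Y."""
    h, w = len(Y), len(Y[0])
    L = []
    for bj in range(blocks_y):
        y0, y1 = bj * h // blocks_y, (bj + 1) * h // blocks_y
        row = []
        for bi in range(blocks_x):
            x0, x1 = bi * w // blocks_x, (bi + 1) * w // blocks_x
            vals = [Y[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(vals) / len(vals))
        L.append(row)
    return L


def rect_change_rate(L_t, L_other, eps=1e-6):
    """Per-block change rate between two grids (assumed form of the rate)."""
    return [[(a - b) / max(b, eps) for a, b in zip(row_t, row_o)]
            for row_t, row_o in zip(L_t, L_other)]
```

Compared with the per-pixel local area luminance, this variant evaluates one value per block, so far fewer comparisons are needed for the same frame.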
  • For each combination of the frame image of interest at time t with the other input frame images, the determination unit 11 determines whether the frame image of interest includes a region that is brighter than the other frame image, and sets the value of the determination flag accordingly.
  • The determination unit 11 determines that the frame image of interest is a frame image including a blinking region when frame images whose determination flag is “1” exist both before and after the frame image of interest in time.
  • As in the case of the local area luminance change rate, the determination flag can be set to “1” or “0” using a preset change-rate threshold and a preset area-rate threshold, according to whether the area rate of the region whose change rate exceeds the change-rate threshold is greater than the area-rate threshold.
  • The determination unit 11 outputs, together with the determination result, the determination flags between the frame image of interest and the other input frame images as analysis information. The determination unit 11 may also output, as auxiliary information, determination flags calculated by the same process between frame images other than the frame image of interest.
  • When the change rate of the rectangular area luminance is used, the determination unit 11 may also output, as analysis information, the change rate of the rectangular area luminance calculated between each rectangular area of the target frame image and the rectangular area at the same position in the other frame images.
  • FIG. 3 is a block diagram illustrating a configuration of the motion estimation unit 12.
  • The motion estimation unit 12 includes a selection unit 12A, a first estimation unit 12B, and a second estimation unit 12C.
  • The motion estimation unit 12 receives as inputs the frame images and the determination result and analysis information output from the determination unit 11. When the target frame image is determined to include a bright region, the motion estimation unit 12 selects, from the input frame images, a plurality of frame images to be used for motion estimation, and estimates the movement amount of the image between the selected frame images caused by movement of the camera and the subject.
  • The selection unit 12A selects the frame images to be used for estimating the movement amount from the frame images other than the target frame image, and acquires a pair of frame images including the selected frame images.
  • The selection unit 12A selects these frame images (hereinafter referred to as “motion estimation frame images”) by, for example, one of the following methods.
  • The selection unit 12A may select one motion estimation frame image from before and one from after the target frame image, based on the luminance difference between the target frame image and the other input frame images. In this case, the selection unit 12A uses the two selected frame images as a pair of motion estimation frame images. Specifically, the selection unit 12A may select the motion estimation frame images using the determination flags calculated by the determination unit 11.
  • For example, among the frame images whose determination flag is “1”, the frame images closest to the target frame image before and after it are selected as the motion estimation frame images.
  • FIG. 4 is a schematic diagram showing a method for selecting a frame image that does not include a bright region.
  • FIG. 4 illustrates, for four cases (cases 1 to 4), the determination flags (flag) obtained when the frame image at time t is compared with each of the other frame images from time (t − 2) to time (t + 2). Note that in FIG. 4 (and similar figures thereafter), frame images that do not include a bright region are shown with hatching, and unhatched frame images include a bright region.
  • In case 1, the selection unit 12A selects the frame images at time (t − 1) and time (t + 1). Similarly, it selects the frame images at time (t − 2) and time (t + 1) in case 2, at time (t − 1) and time (t + 2) in case 3, and at time (t − 2) and time (t + 2) in case 4.
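The four cases above can be reproduced with a short sketch. It assumes that a determination flag of 1 against the frame of interest marks a frame that does not include the bright region, and that the nearest such frame on each side is chosen. The function name and the dictionary representation of the flags are hypothetical.

```python
def select_motion_estimation_pair(flags, t):
    """Pick the nearest frames before and after time t whose flag is 1.

    flags maps a time index to the determination flag computed between
    the frame of interest (time t) and the frame at that time.
    Returns (time_before, time_after), or None if either side has no candidate.
    """
    before = [k for k in sorted(flags) if k < t and flags[k] == 1]
    after = [k for k in sorted(flags) if k > t and flags[k] == 1]
    if not before or not after:
        return None
    return before[-1], after[0]


# Times are given relative to the frame of interest (t = 0).
case1 = {-2: 1, -1: 1, 1: 1, 2: 1}
case2 = {-2: 1, -1: 0, 1: 1, 2: 1}
case3 = {-2: 1, -1: 1, 1: 0, 2: 1}
case4 = {-2: 1, -1: 0, 1: 0, 2: 1}
pairs = [select_motion_estimation_pair(c, 0) for c in (case1, case2, case3, case4)]
```

Here `pairs` holds the selected times for cases 1 to 4 in order, expressed as offsets from t.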
  • The selection unit 12A may correct the selection of motion estimation frames using the determination flags between frame images other than the target frame image, which are input as auxiliary information. When the frame at time (t + k) has been selected as a motion estimation frame based on the determination flags between the frame image of interest and the other frame images, the selection unit 12A may correct the selection as follows.
  • When the determination flag flag_{t-t+k} of the frame image at time (t + k + 1) and the determination flag flag_{t+k+1-t+k} between the frame image at time (t + k + 1) and the frame image at time (t + k) are both “1”, a large luminance change is considered to exist between the frame image at time (t + k + 1) and the frame image at time (t + k) as well. In this case, the selection unit 12A may therefore change (correct) the motion estimation frame image to the frame image at time (t + k + 1).
  • The selection unit 12A may select a plurality of motion estimation frame images from before and after the target frame image, based on the luminance change between the target frame image and the other input frame images. In this case, the selection unit 12A acquires a plurality of pairs of frame images. Specifically, the selection unit 12A may select, from the frame images neighboring the target frame image, a predetermined number of frame images whose determination flag calculated by the determination unit 11 is “1”.
  • FIG. 5 is a schematic diagram showing an example of selecting a plurality (two pairs in this case) of motion estimation frame images.
  • In this example, when the frame images at times (t − 2), (t − 1), (t + 1), and (t + 2) do not include a bright region, the selection unit 12A selects all of these frame images as motion estimation frames.
  • The determination flags in the example of FIG. 5 are the same as those in case 1 of FIG. 4.
  • In this case, the selection unit 12A selects as motion estimation frames not only the frame images at time (t − 1) and time (t + 1) but also the frame images at time (t − 2) and time (t + 2).
  • When frequent flickering occurs within a short time or when a flash band occurs, this selection method can increase the accuracy of motion estimation by selectively using, from the multiple frame images, the areas that are less affected by the flickering light (see, for example, FIG. 7).
  • A flash band is a large change (shift) in signal intensity that arises, when a short emission of light such as a flash occurs, from the difference in exposure period between lines in a rolling-shutter imaging device such as a CMOS (Complementary Metal-Oxide-Semiconductor) sensor.
  • The selection unit 12A may select, as motion estimation frame images, the target frame image and one frame image from either before or after it, based on the luminance difference between the target frame image and the other input frame images. Specifically, the selection unit 12A may select the frame image closest to the target frame image from among the frames whose determination flag calculated by the determination unit 11 is “1”. When the determination flag is “1” both before and after the frame image of interest, the selection unit 12A selects only the frame on a preset side.
  • FIG. 6 shows an example in which a frame image at a time earlier than the target frame image is selected. In this case, the selection unit 12A uses the selected frame image and the frame image of interest as a pair of motion estimation frame images.
  • With this selection method, the number of images processed by the motion estimation unit 12 and the image generation unit 13 is reduced, so high-speed processing can be realized.
  • However, this selection method assumes that corresponding points can be detected in the frame image of interest.
  • The first estimation unit 12B estimates the pixel motion caused by camera or subject motion between a pair of motion estimation frame images. Motion estimation is performed on a combination (pair) of any two of the motion estimation frame images. The first estimation unit 12B performs motion estimation on at least one of the one or more pairs.
  • For example, the first estimation unit 12B performs motion estimation on a pair of two frame images selected one each from before and after the target frame image.
  • Alternatively, the first estimation unit 12B may perform motion estimation on a pair consisting of the target frame image and one of the frame images selected from before and after it.
  • The first estimation unit 12B may also compare the rectangular area luminance of each rectangular area of the target frame image with that of the rectangular area at the same position in each of a plurality of frame images selected from before and after the target frame image, and detect the areas in which the change rate of the rectangular area luminance exceeds a threshold. The first estimation unit 12B then pairs frame images that share a common area in which the change rate exceeds the threshold, and estimates motion for the common area of each pair (the area surrounded by a dotted line in FIG. 7).
  • This threshold may be a preset value, or an appropriate value may be set dynamically so that motion estimation can be performed over a certain area.
  • The first estimation unit 12B may also use the determination flags between frame images other than the target frame image, input from the determination unit 11, and perform motion estimation on a pair of frame images between which the determination flag is “0”.
  • Alternatively, the first estimation unit 12B performs motion estimation on a pair consisting of the target frame image and a frame selected from either before or after it.
  • The motion of the image caused by camera motion is a global motion of the whole screen, and can therefore be expressed by an affine transformation between a pair of motion estimation frame images.
  • An affine transformation is a geometric transformation that combines a translation between two images with a linear transformation (enlargement/reduction, rotation, skew).
  • The affine transformation between the images is expressed by Equation (6). When the linear transformation matrix in Equation (6) is decomposed by QR decomposition, Equation (6) can be expressed as Equation (7).
  • The affine transformation parameters can be calculated by detecting the corresponding points on the image I′ for three or more pixels on the image I and substituting their coordinates into Equation (7).
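As an illustration of why three correspondences suffice, the sketch below solves for the six parameters of a general affine map x′ = a·x + b·y + tx, y′ = c·x + d·y + ty from exactly three point pairs. Equations (6) and (7) are not reproduced in this text, so this uses the generic six-parameter form rather than the QR-decomposed parameterization (scale, rotation, skew). All function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(A, rhs):
    """Solve the 3x3 system A u = rhs by Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("the three points are collinear")
    solution = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = rhs[r]
        solution.append(det3(M) / d)
    return solution


def affine_from_three_points(src, dst):
    """Return (a, b, tx, c, d, ty) mapping each src point onto its dst point."""
    A = [[x, y, 1.0] for x, y in src]
    a, b, tx = solve3(A, [p[0] for p in dst])
    c, d, ty = solve3(A, [p[1] for p in dst])
    return a, b, tx, c, d, ty
```

Each correspondence contributes two linear equations, so three non-collinear pairs determine the six unknowns exactly; with more than three pairs the system becomes overdetermined and a least squares or robust fit is needed.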
  • The first estimation unit 12B can detect corresponding points by, for example, the following methods.
  • The first estimation unit 12B calculates an optical flow for the pixel P on the image I, and takes the pixel P′ to which the pixel P has moved as the corresponding point.
  • Methods for calculating the optical flow include those based on the Lucas-Kanade method and the Horn-Schunck method.
  • The Lucas-Kanade method calculates the movement amount of an image under the constraint that pixel values remain approximately the same before and after the movement (Non-Patent Document 3).
  • The Horn-Schunck method calculates the movement amount of an image by minimizing an error function over the entire image while taking into account the smoothness between adjacent optical flows (Non-Patent Document 4).
  • Alternatively, the first estimation unit 12B identifies the region R′ on the image I′ that corresponds to the region R on the image I, and takes the pixel P′ at the center coordinates of the region R′ as the corresponding point of the pixel P at the center coordinates of the region R.
  • The regions R and R′ may be rectangular regions obtained by dividing the images I and I′ into a grid of a predetermined size, or clusters generated by clustering pixels based on image features such as color and texture.
  • The first estimation unit 12B can detect the region R′ by template matching using the region R as a template.
  • As the similarity index for template matching, the first estimation unit 12B may use SSD (Sum of Squared Difference), SAD (Sum of Absolute Difference), normalized cross-correlation (ZNCC: Zero-mean Normalized Cross-Correlation), or the like.
  • In particular, the normalized cross-correlation (R_ZNCC) is calculated, as shown in Equation (8), from the luminance values of the template and the image (T(i, j) and I(i, j)) after subtracting their respective average values (T_ave and I_ave).
  • Because of this normalization, the similarity can be evaluated stably even when the brightness varies. Therefore, by using the normalized cross-correlation, the first estimation unit 12B can detect the region R′ more stably than with the other indices, even when the luminance differs between a pair of motion estimation frame images because of flash light.
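The robustness of ZNCC to brightness changes can be checked with a small sketch. Equation (8) is not reproduced in this text; the code uses the standard definition of zero-mean normalized cross-correlation, and SSD is included only for contrast. The function names and the sample patches are hypothetical.

```python
import math


def ssd(template, patch):
    """Sum of squared differences between two equally sized 2-D lists."""
    return sum((a - b) ** 2
               for row_t, row_p in zip(template, patch)
               for a, b in zip(row_t, row_p))


def zncc(template, patch):
    """Zero-mean normalized cross-correlation; 1.0 means a perfect match."""
    t = [v for row in template for v in row]
    i = [v for row in patch for v in row]
    t_ave = sum(t) / len(t)
    i_ave = sum(i) / len(i)
    num = sum((a - t_ave) * (b - i_ave) for a, b in zip(t, i))
    den = math.sqrt(sum((a - t_ave) ** 2 for a in t)
                    * sum((b - i_ave) ** 2 for b in i))
    return num / den if den > 0 else 0.0  # flat patches carry no structure


T = [[10.0, 20.0], [30.0, 40.0]]
# Same structure, but brightened by a gain and an offset (as a flash might do).
flashed = [[2.0 * v + 15.0 for v in row] for row in T]
score = zncc(T, flashed)
```

Here `score` stays at essentially 1 although every pixel value has changed, whereas `ssd(T, flashed)` is large, which is why SSD and SAD are more easily misled by flash-induced luminance differences.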
  • the first estimation unit 12B may detect the pixel P ′ corresponding to the corresponding point of the pixel P corresponding to the center coordinate of the region R using the optical flow. For example, the first estimation unit 12B uses the representative value (weighted average value or median value) of the optical flow estimated for each pixel in the region R as the movement amount of the region R, and moves the pixel P by the movement amount of the region R. Let the previous pixel P ′ be a corresponding point.
  • The first estimation unit 12B extracts a pixel P corresponding to a feature point from the image I, and takes the pixel P′ on the image I′ that corresponds to the destination of the pixel P as the corresponding point.
  • The first estimation unit 12B may use, for example, corner points detected by the Harris corner detection algorithm as feature points. The Harris corner detection algorithm is based on the observation that "the first derivative (difference) is large in only one direction at a point on an edge, whereas it is large in multiple directions at a point on a corner", and extracts points at which the Harris operator dst (x, y) takes a large positive local maximum.
  • fx and fy denote the first derivatives (differences) in the x and y directions, respectively.
  • G σ denotes smoothing by a Gaussian distribution with standard deviation σ.
  • k is a constant, and a value from 0.04 to 0.15 is empirically used.
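As a rough illustration of the quantities fx, fy, G σ and k, the sketch below computes a Harris-type response in the commonly used form dst = det(M) - k * tr(M)^2, where M is the Gaussian-smoothed matrix of products of fx and fy. That form is an assumption here, since the source equation is not reproduced in this text, and the helper names, the central-difference derivative, the kernel radius, and the test image are likewise illustrative choices.

```python
from math import exp

def _clamp(v, n):
    return max(0, min(n - 1, v))

def gaussian_kernel(sigma, radius):
    k = [exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth(img, kernel):
    """Separable Gaussian smoothing (G sigma) with clamped borders."""
    h, w, r = len(img), len(img[0]), len(kernel) // 2
    tmp = [[sum(kernel[i + r] * img[y][_clamp(x + i, w)] for i in range(-r, r + 1))
            for x in range(w)] for y in range(h)]
    return [[sum(kernel[i + r] * tmp[_clamp(y + i, h)][x] for i in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def harris_response(img, sigma=1.0, k=0.04):
    """Return dst(x, y) = det(M) - k * tr(M)^2 for every pixel, indexed [y][x]."""
    h, w = len(img), len(img[0])
    # First derivatives by central differences.
    fx = [[(img[y][_clamp(x + 1, w)] - img[y][_clamp(x - 1, w)]) / 2.0
           for x in range(w)] for y in range(h)]
    fy = [[(img[_clamp(y + 1, h)][x] - img[_clamp(y - 1, h)][x]) / 2.0
           for x in range(w)] for y in range(h)]
    kern = gaussian_kernel(sigma, radius=2)
    sxx = smooth([[a * a for a in row] for row in fx], kern)
    syy = smooth([[b * b for b in row] for row in fy], kern)
    sxy = smooth([[a * b for a, b in zip(ra, rb)] for ra, rb in zip(fx, fy)], kern)
    return [[sxx[y][x] * syy[y][x] - sxy[y][x] ** 2
             - k * (sxx[y][x] + syy[y][x]) ** 2
             for x in range(w)] for y in range(h)]

# A bright square occupying x >= 6, y >= 6 of a 12x12 image: one visible corner.
image = [[100 if x >= 6 and y >= 6 else 0 for x in range(12)] for y in range(12)]
response = harris_response(image)
strongest = max((response[y][x], (x, y)) for y in range(12) for x in range(12))[1]
```

The response is zero in flat areas, negative along the straight edges of the square, and positive near its corner, which matches the edge/corner observation quoted above.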
  • the first estimation unit 12B may identify the corresponding point based on the optical flow detected at the feature point.
  • When an image feature (for example, a SIFT (Scale-Invariant Feature Transform) feature) extracted from an image patch containing a feature point of the image I is similar to the image feature extracted from an image patch of the image I′, the first estimation unit 12B may take the center of that image patch of the image I′ as the corresponding point P′.
  • The first estimation unit 12B may calculate the affine transformation parameters from three highly reliable pairs of corresponding points among those detected by the above methods, or may calculate them by the least squares method from three or more pairs of corresponding points. Alternatively, the first estimation unit 12B may calculate the affine transformation parameters using a robust estimation method such as RANSAC (RANdom SAmple Consensus). RANSAC randomly selects three pairs of corresponding points, calculates tentative affine transformation parameters from them, and adopts those parameters as the true affine transformation parameters when many of the remaining pairs of corresponding points are consistent with them.
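The RANSAC procedure can be sketched as follows. The sketch makes several assumptions that are not taken from the source: the affine transformation is represented as a generic 2x3 matrix (a, b, tx, c, d, ty) rather than the parameterization of Equation (7), the consensus rule is "keep the candidate with the most inliers" (a common variant of "adopt when many pairs agree"), and the function names, tolerance, and iteration count are hypothetical.

```python
import random

def _det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def _solve3(m, v):
    """Solve a 3x3 linear system by Cramer's rule; None if (near) singular."""
    d = _det3(m)
    if abs(d) < 1e-9:
        return None
    return [_det3([[v[r] if j == c else m[r][j] for j in range(3)]
                   for r in range(3)]) / d for c in range(3)]

def affine_from_three(pairs):
    """pairs: three ((x, y), (x2, y2)) tuples -> (a, b, tx, c, d, ty) or None."""
    m = [[x, y, 1.0] for (x, y), _ in pairs]
    row_x = _solve3(m, [dst[0] for _, dst in pairs])
    row_y = _solve3(m, [dst[1] for _, dst in pairs])
    if row_x is None or row_y is None:
        return None  # the three source points are collinear
    return tuple(row_x + row_y)

def apply_affine(params, pt):
    a, b, tx, c, d, ty = params
    return (a * pt[0] + b * pt[1] + tx, c * pt[0] + d * pt[1] + ty)

def ransac_affine(pairs, iterations=200, tol=1.0, seed=0):
    """Return (best parameters, inlier count) over random three-pair samples."""
    rng = random.Random(seed)
    best, best_count = None, -1
    for _ in range(iterations):
        params = affine_from_three(rng.sample(pairs, 3))
        if params is None:
            continue
        count = 0
        for src, dst in pairs:
            px, py = apply_affine(params, src)
            if (px - dst[0]) ** 2 + (py - dst[1]) ** 2 <= tol ** 2:
                count += 1
        if count > best_count:
            best, best_count = params, count
    return best, best_count

# Nine pairs that follow a pure translation (+5, -3), plus three gross outliers.
inliers = [((x, y), (x + 5.0, y - 3.0)) for x in (0, 10, 20) for y in (0, 10, 20)]
outliers = [((5, 5), (40.0, 40.0)), ((15, 5), (-30.0, 20.0)), ((5, 15), (0.0, -50.0))]
params, count = ransac_affine(inliers + outliers)
```

Because each candidate is fitted to only three pairs, a single mismatched corresponding point cannot pull the final estimate, which is the reason for preferring this over plain least squares when outliers are expected.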
  • the first estimation unit 12B may exclude a specific image region from the calculation target of the affine transformation parameter.
  • Such an image region is, for example, a region known to have low corresponding-point detection accuracy, such as an edge portion of the image that is likely to leave the shooting range when the camera moves, or a flat portion with little luminance difference from adjacent pixels.
  • Alternatively, such an image region is a region whose pixel values change because of factors other than the movement of the camera, such as a region in the center of the screen where a moving subject is likely to appear, or a portion lit by fixed illumination whose color changes.
  • The combination of (12B-1), (12B-2), (12B-3) with (12A-1), (12A-2), (12A-3) described above is not particularly limited. That is, the first estimation unit 12B may execute (12B-1), (12B-2), or (12B-3) on the motion estimation frame images selected by any of the methods (12A-1), (12A-2), and (12A-3).
  • the first estimation unit 12B may use camera motion information acquired by a measuring instrument (gyroscope, depth sensor, etc.) mounted on the camera in addition to the motion estimation by the image processing described above.
  • Second estimation unit 12C
  • The second estimation unit 12C obtains the motion of the image caused by the motion of the subject by detecting a subject region from one of the pair of motion estimation frame images and estimating the corresponding region (the region corresponding to the subject region) from the other. Alternatively, the second estimation unit 12C may generate a converted image by applying an affine transformation to one or both of the pair of motion estimation frame images, and detect the subject region from one of the pair of motion estimation frame images or its converted image. In this case, the second estimation unit 12C may obtain the motion of the image caused by the motion of the subject by estimating the corresponding region from the other frame image of the pair or its converted image.
  • the second estimation unit 12C detects the pair of the subject area and the corresponding area by subtracting the moving amount of the image caused by the camera movement based on the affine transformation parameter and the motion estimation frame image pair. Based on this pair, the second estimation unit 12C estimates the amount of image movement caused by the movement of the subject.
  • Examples of the subject area detection method include the following methods.
  • the second estimation unit 12C detects an image (a set of pixels) that moves differently from the movement amount estimated by the affine transformation parameter from one of the pair of motion estimation frame images as a subject area.
  • Using Equation (7), the second estimation unit 12C calculates, for a pixel P of the image I, the prediction vector (u, v) between the image I and the image I′ based on the affine transformation parameters calculated between the image I and the image I′.
  • The second estimation unit 12C selects the pixel P as a candidate point when the difference between the vector (x′ - x, y′ - y) from the pixel P to the pixel P′ and the prediction vector (u, v) is equal to or greater than a certain value.
  • calculating the vector difference means subtracting the amount of movement of the image due to the movement of the camera.
  • the second estimation unit 12C detects the set of candidate points as the subject area of the image I.
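The candidate-point test described above can be sketched as follows. The 2x3 affine representation (a, b, tx, c, d, ty) stands in for the parameters of Equation (7), which is not reproduced here, and the function names and the Euclidean norm used for the "difference between the vectors" are assumptions for illustration.

```python
from math import hypot

def predict_vector(affine, p):
    """Prediction vector (u, v): where camera motion alone would move pixel p."""
    a, b, tx, c, d, ty = affine
    x, y = p
    return (a * x + b * y + tx - x, c * x + d * y + ty - y)

def subject_candidates(affine, correspondences, threshold):
    """Return the pixels P whose observed motion deviates from the prediction.

    correspondences: list of ((x, y), (x2, y2)) pairs for P on I and P' on I'.
    """
    candidates = []
    for (x, y), (x2, y2) in correspondences:
        u, v = predict_vector(affine, (x, y))
        # Subtracting (u, v) removes the image motion caused by the camera.
        if hypot((x2 - x) - u, (y2 - y) - v) >= threshold:
            candidates.append((x, y))
    return candidates

# Camera pans by 2 pixels in x; the pixel at (3, 3) additionally moves by (4, 1).
camera = (1.0, 0.0, 2.0, 0.0, 1.0, 0.0)
pairs = [((0, 0), (2, 0)), ((5, 5), (7, 5)), ((3, 3), (9, 4))]
subject_region = subject_candidates(camera, pairs, threshold=2.0)
```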
  • The second estimation unit 12C calculates the difference between a converted image generated by affine transformation of one of the pair of motion estimation frame images and a converted image generated by affine transformation (inverse transformation) of the other, and detects a region with a large difference as the subject region from both converted images.
  • Using Equation (7), the second estimation unit 12C generates, from the image I, a predicted image I p at an arbitrary time t based on the affine transformation parameters calculated between the image I and the image I′.
  • the second estimation unit 12C generates a predicted image I p ′ at time t from the image I ′ based on the affine transformation parameters calculated between the image I and the image I ′.
  • The second estimation unit 12C calculates the difference between the predicted images I p and I p′, and detects the set of pixels whose absolute difference is equal to or greater than a certain value as the subject region in each of the predicted images I p and I p′.
  • the second estimation unit 12C can generate a pixel (x p , y p ) on the predicted image I p by substituting the pixel (x, y) of the image I into Expression (9).
  • The affine transformation parameters between the image I and the image I p are (θ p , a p , b p , d p , t px , t py ).
  • (θ p , a p , b p , d p , t px , t py ) can be calculated by the following relational expression.
  • Here, the affine transformation parameters from the image I to the image I′ are (θ, a, b, d, t x , t y ), the time difference between the image I and the image I′ is T, and the time difference between the image I and the image I p is T p .
  • The second estimation unit 12C may calculate (θ p , a p , b p , d p , t px , t py ) by weighting the rate of change.
  • the second estimation unit 12C can generate the pixel (x p ′, y p ′) of the predicted image I p ′ by substituting the pixel (x ′, y ′) of the image I ′ into Expression (10).
  • The affine transformation parameters between the image I′ and the image I p′ are (θ p′, a p′, b p′, d p′, t px′, t py′).
  • (θ p′, a p′, b p′, d p′, t px′, t py′) is obtained by the following relational expression.
  • Here, the affine transformation parameters from the image I′ to the image I are (θ′, a′, b′, d′, t x′, t y′), the time difference between the image I and the image I′ is T, and the time difference between the image I and the image I p′ is T p′.
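The thresholding step of this difference-based detection can be sketched as below. Generating the predicted images themselves requires Equations (9) and (10), which are not reproduced in this text, so the sketch simply takes two already-aligned luminance images as input; the function names and the toy values are assumptions.

```python
def subject_mask(pred_a, pred_b, threshold):
    """Mark pixels whose absolute difference between two predicted images
    is at least the threshold."""
    return [[abs(a - b) >= threshold for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_a, pred_b)]

def mask_coordinates(mask):
    """List the (x, y) coordinates of the marked pixels."""
    return [(x, y) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]

# Toy 3x3 luminance images standing in for the predicted images I p and I p'.
pred_from_i = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
pred_from_i_dash = [[12, 10, 10], [10, 90, 10], [10, 10, 180]]
region = mask_coordinates(subject_mask(pred_from_i, pred_from_i_dash, threshold=50))
```

Small differences such as the 2-level change at (0, 0) stay below the threshold, so only pixels that camera motion cannot explain are kept.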
  • The second estimation unit 12C may detect a region with a large difference between a converted image generated by affine transformation of one of the pair of motion estimation frame images and the other frame image as the subject region in each of the converted image and the frame image.
  • This detection method is a derivative of (12C-1-2).
  • Using Equation (7), the second estimation unit 12C generates, from the image I, a predicted image at the time t + k based on the affine transformation parameters calculated between the image I and the image I′, and calculates its difference from the image I′.
  • When the second estimation unit 12C has detected the subject region, it estimates the corresponding region for the detected subject region. Examples of methods for estimating the corresponding region include the following. The second estimation unit 12C may use each method alone or in combination.
  • The second estimation unit 12C calculates the optical flow to the other frame image for all pixels in the subject region detected from one of the pair of motion estimation frame images, and detects, as the corresponding region, the destination reached by moving the subject region by the weighted average of the optical flow.
  • The second estimation unit 12C may instead calculate the optical flow to the other frame image, or to its converted image, for all pixels in the subject region detected from a converted image generated by affine transformation of one of the pair.
  • the second estimation unit 12C may give a high weight to the optical flow of the pixels close to the center of gravity of the subject region as the weight used in the calculation of the weighted average of the optical flow.
  • The second estimation unit 12C may give a high weight to the optical flow of pixels in the subject region that have a large luminance gradient relative to their surroundings, or to the optical flow of pixels whose flow shows a small variance in direction or magnitude relative to the optical flow calculated at the surrounding pixels.
  • The second estimation unit 12C may exclude, as outliers, those flows in the subject region whose magnitude is at or above a certain value or at or below a certain value, and weight the remaining optical flows equally.
  • By setting the weights based on the luminance gradient and on the variance of the direction or magnitude of the optical flow, the second estimation unit 12C can estimate the position of the corresponding region from highly reliable optical flow.
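A minimal sketch of the weighted-average displacement follows. It combines two of the options described above, a higher weight near the center of gravity and rejection of flows by magnitude, purely for illustration; the specific weight function 1 / (1 + distance), the parameter names, and the sample flows are assumptions and are not taken from the source.

```python
from math import hypot

def region_displacement(pixels, flows, min_mag=0.0, max_mag=float("inf")):
    """Weighted average of per-pixel optical flow over a subject region.

    pixels: [(x, y)] coordinates in the subject region.
    flows:  [(dx, dy)] optical flow estimated at each of those pixels.
    Flows whose magnitude lies outside [min_mag, max_mag] are dropped as outliers.
    """
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    weight_sum = sum_dx = sum_dy = 0.0
    for (x, y), (dx, dy) in zip(pixels, flows):
        if not (min_mag <= hypot(dx, dy) <= max_mag):
            continue
        # Pixels closer to the center of gravity receive a higher weight.
        weight = 1.0 / (1.0 + hypot(x - cx, y - cy))
        weight_sum += weight
        sum_dx += weight * dx
        sum_dy += weight * dy
    if weight_sum == 0.0:
        return (0.0, 0.0)
    return (sum_dx / weight_sum, sum_dy / weight_sum)

pixels = [(0, 0), (1, 0), (2, 0)]
flows = [(2.0, 0.0), (2.0, 0.0), (50.0, 0.0)]  # the last flow is a gross outlier
robust = region_displacement(pixels, flows, max_mag=10.0)
naive = region_displacement(pixels, flows)
```

The corresponding region is then the subject region shifted by the returned displacement; with the outlier gate enabled the single bad flow no longer drags the estimate.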
  • The second estimation unit 12C uses, as a template, the subject region detected in one of the pair of motion estimation frame images or in its affine-transformed converted image, and detects the corresponding region by template matching that scans the other frame image or its affine-transformed converted image.
  • the second estimation unit 12C may use any of the indices described in (12B-2) as a similarity index used for template matching, or may use another method.
  • The second estimation unit 12C may detect the corresponding region based on the distance (Euclidean distance) between image features expressing color and texture. For example, the second estimation unit 12C may extract an image feature from the subject region detected in one of the pair of motion estimation frame images, and detect as the corresponding region a region of the other frame image whose image feature is at a short distance from the extracted image feature.
  • the second estimation unit 12C roughly estimates the position of the corresponding region by template matching using the entire subject region as a template, and then searches again around each partial region generated by dividing the subject region, The corresponding area may be determined.
  • The second estimation unit 12C detects feature points in the subject region detected in one of the pair of motion estimation frame images or in its affine-transformed converted image, and detects the optical flow by finding the points in the other frame image or its converted image that correspond to those feature points.
  • the second estimator 12C detects, as a corresponding region, a destination that has moved the subject region by the weighted average of the detected optical flow.
  • the second estimation unit 12C may use, for example, a Harris corner point as a feature point, or may use a feature point detected by another method.
  • The second estimation unit 12C may execute (12C-2-1), (12C-2-2), or (12C-2-3) on the subject region detected by any of the methods (12C-1-1), (12C-1-2), and (12C-1-3).
  • After detecting the subject region and estimating its corresponding region, the second estimation unit 12C estimates the motion of the subject. Examples of methods for estimating the motion of the subject include the following.
  • When the second estimation unit 12C has detected, as the subject region, a set of pixels in one of the pair of motion estimation frame images that move differently from the movement amount estimated from the affine transformation parameters (12C-1-1), it estimates the motion of the subject by the following method.
  • the second estimation unit 12C calculates the difference between the position information (coordinates) representing the position of the subject area and the position information of the corresponding area, and uses this as a temporary movement vector of the subject area.
  • The second estimation unit 12C calculates the difference between the temporary movement vector and the movement vector of the image caused by the camera movement in the pair of motion estimation frame images, and takes that difference as the true movement vector of the subject region between the pair.
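The two subtractions above amount to the following small sketch. The function name and the use of a single representative coordinate for each region are assumptions for illustration.

```python
def true_movement_vector(subject_pos, corresponding_pos, camera_vector):
    """Temporary vector (corresponding region minus subject region) with the
    camera-induced image motion subtracted."""
    temp = (corresponding_pos[0] - subject_pos[0],
            corresponding_pos[1] - subject_pos[1])
    return (temp[0] - camera_vector[0], temp[1] - camera_vector[1])

# Subject region at (10, 10), corresponding region at (18, 13),
# while the camera motion alone shifts the image by (5, 1).
vector = true_movement_vector((10, 10), (18, 13), (5, 1))
```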
  • (12C-3-2) Motion estimation method 2
  • When the second estimation unit 12C has detected, as the subject region, a region with a large difference between the converted images generated by affine transformation of each of the pair of motion estimation frame images, from both converted images (12C-1-2), it estimates the motion of the subject by the following method.
  • The second estimation unit 12C calculates the difference between the position information of the subject region in one converted image and the position information of the corresponding region detected from the other converted image, and takes this difference as the true movement vector of the subject between the pair of motion estimation frame images.
  • (12C-3-3) Motion estimation method 3
  • When the second estimation unit 12C has detected, as the subject region, a region with a large difference between a converted image generated by affine transformation of one of the pair of motion estimation frame images and the other frame image (12C-1-3), it estimates the motion of the subject by the following method.
  • The second estimation unit 12C calculates the difference between the position information of the subject region in the converted image and the position information of the corresponding region detected from the other frame image, and takes this difference as the true movement vector of the subject between the pair of motion estimation frame images. This estimation method is a derivative of (12C-3-2) described above.
  • the motion estimation unit 12 outputs the estimated motion information to the image generation unit 13.
  • the motion information includes at least one of motion information caused by camera motion and motion information caused by subject motion.
  • the motion estimation unit 12 outputs the time of each frame of the pair of motion estimation frame images used for motion estimation and the affine transformation parameters calculated between the pair as motion information resulting from the motion of the camera.
  • the motion estimation unit 12 outputs motion information resulting from the motion of the camera by the number of pairs of motion estimation frame images that have undergone motion estimation.
  • The motion estimation unit 12 outputs, as motion information resulting from the motion of the subject, each frame image of the pair of motion estimation frame images used for estimating the motion of the subject and its time, the position information of the subject region, the position information of the corresponding region, and the true movement vector of the subject.
  • The position information of the subject region represents coordinates in one of the pair of motion estimation frame images.
  • The position information of the corresponding region represents coordinates in the other of the pair of motion estimation frame images.
  • When the motion estimation unit 12 detects the subject region and estimates its corresponding region in converted images generated by affine transformation of the pair of motion estimation frame images, it outputs the motion information resulting from the motion of the subject as follows.
  • The motion estimation unit 12 outputs the time of each frame of the pair of motion estimation frame images used for estimating the motion of the subject, the position information of the subject region, the position information of the corresponding region, and the true movement vector of the subject.
  • the position information of the subject region represents coordinates in a converted image generated by affine transformation of one of the pair of motion estimation frame images.
  • the position information of the corresponding region represents coordinates in a converted image generated by affine transformation of the other of the pair of motion estimation frame images.
  • the motion estimation unit 12 outputs motion information resulting from the motion of the subject for the number of pairs of motion estimation frame images for which motion estimation has been performed.
  • FIG. 8 is a block diagram illustrating a configuration of the image generation unit 13.
  • the image generation unit 13 includes a first correction unit 13A, a second correction unit 13B, and a synthesis unit 13C.
  • The image generation unit 13 receives as inputs a plurality of frame images, the analysis information from the determination unit 11, and the motion information from the motion estimation unit 12. When the frame image of interest is determined to be a frame image including a bright region caused by blinking of light, the image generation unit 13 corrects the motion estimation frame image into an image at the time of the frame image of interest and outputs it as a corrected frame image.
  • the first correction unit 13A first generates a first corrected image by correcting the motion of the camera for each motion estimation frame image.
  • the second correction unit 13B generates a second corrected image by correcting the motion of the subject for each motion estimation frame image.
  • The synthesis unit 13C generates a corrected frame image by combining the second corrected images generated for the respective motion estimation frame images.
  • the first correction unit 13A corrects the camera motion by, for example, the following method based on the image data of the pair of motion estimation frame images and the affine transformation parameters calculated between the pair.
  • the first correction unit 13A determines that there is no camera movement when each value of the affine transformation parameter is smaller than a preset threshold value, and does not need to correct the camera movement. In this case, the first correction unit 13A regards the uncorrected motion estimation frame image as the first corrected image.
  • When the two frame images that are closest to the target frame image and do not include the bright region are selected as the motion estimation frame images (12A-1), the first correction unit 13A generates the first corrected image by the following method. The first correction unit 13A uses the affine transformation parameters calculated between the two selected frame images to generate a first corrected image from each of these frame images.
  • Specifically, taking one of the motion estimation frame images as the image I and the other as the image I′, the first correction unit 13A generates the predicted images I p and I p′ at the time t of the target frame image as the first corrected images.
  • When a plurality of frame images are selected as the motion estimation frame images from before and after the attention frame image (12A-2), the first correction unit 13A generates the first corrected image by the following method.
  • the first correction unit 13A generates a first correction image from each pair based on each affine transformation parameter calculated from a plurality of pairs of motion estimation frame images.
  • Specifically, taking one frame image of each pair of motion estimation frame images as the image I and the other as the image I′, the first correction unit 13A generates the predicted images I p and I p′ at the time t as first corrected images. For example, as illustrated in FIG. 7, when the first correction unit 13A selects two frames each before and after the target frame image and performs motion estimation for two pairs, the four predicted images at the time of the target frame image, generated for the respective selected frames, are used as the first corrected images.
  • In the case of (12A-3), the first correction unit 13A generates the first corrected image by the following method. The first correction unit 13A generates a first corrected image from the selected frame image based on the affine transformation parameters calculated between the target frame image and the selected frame image.
  • Specifically, taking the frame image selected as the motion estimation frame image as the image I, the first correction unit 13A generates the predicted image I p at the time t of the frame image of interest as the first corrected image.
  • The second correction unit 13B corrects the motion of the subject by updating the pixel information at the position of the subject in the frame image of interest.
  • the second correction unit 13B can correct the movement of the subject by the following method.
  • The second correction unit 13B may determine that the subject has not moved when each value of the true movement vector of the subject is smaller than a preset threshold, and need not correct the motion of the subject. In this case, the second correction unit 13B regards the first corrected image as the second corrected image.
  • Based on the true movement vector of the subject between the pair of motion estimation frame images and on the time information of the pair and of the attention frame image, the second correction unit 13B determines the true movement vector of the subject between each frame image of the pair and the attention frame image.
  • Using the pixel values of the subject region identified in the first corrected image, the second correction unit 13B updates the pixel values at the destination reached by moving the coordinates of the subject region by the true movement vector, and the pixel values at the coordinates of the subject region itself. The second correction unit 13B thereby generates the second corrected image.
  • The second correction unit 13B may update the pixel values by replacing the pixel value at the destination with the pixel value of the subject region. The second correction unit 13B may also replace the pixel value at the destination with a weighted average of that pixel value and the pixel value of the subject region, or with a weighted average of the pixel values around the destination and the pixel value of the subject region.
  • The second correction unit 13B may replace the pixel value at the coordinates of the subject region with the pixel value at the destination reached by moving by the inverse of the true movement vector.
  • The second correction unit 13B may also replace the pixel value at the coordinates of the subject region with a weighted average of that pixel value and the pixel value at the destination reached by moving by the inverse of the true movement vector, or with a weighted average of the pixel value at that destination and the pixel values around it.
  • the true movement vector of the subject between each frame image of the pair of frame images for motion estimation and the target frame image is obtained by the following equation.
  • Here, V is the true movement vector of the subject region between the frame images I1 and I2 constituting the pair of motion estimation frame images, T1 and T2 are the times of the frame images I1 and I2, respectively, and T3 is the time of the frame image of interest (T1 < T3 < T2).
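The equation referred to above is not reproduced in this text. As a hedged illustration only, the sketch below assumes the subject moves at constant velocity between I1 and I2, so that V is split in proportion to the elapsed time; both that assumption and the function name are hypothetical and should not be read as the source's equation.

```python
def vectors_to_frame_of_interest(v, t1, t2, t3):
    """Split the true movement vector V (from I1 to I2) at time T3.

    Assumes constant velocity between T1 and T2 (an illustrative assumption).
    Returns (vector from I1 to the frame of interest,
             vector from I2 to the frame of interest).
    """
    if not (t1 < t3 < t2):
        raise ValueError("expected T1 < T3 < T2")
    ratio = (t3 - t1) / (t2 - t1)
    from_i1 = (v[0] * ratio, v[1] * ratio)
    from_i2 = (-v[0] * (1.0 - ratio), -v[1] * (1.0 - ratio))
    return from_i1, from_i2

# V = (8, 4) between T1 = 0 and T2 = 4, frame of interest at T3 = 1.
forward, backward = vectors_to_frame_of_interest((8, 4), 0, 4, 1)
```

The two returned vectors differ by exactly V, so moving the subject forward from I1 and backward from I2 lands on the same position in the frame of interest.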
  • The second correction unit 13B can identify the subject region in the first corrected image by treating, as pixels of the subject region, those pixels of the first corrected image that correspond to pixels determined to belong to the subject region in the motion estimation frame image.
  • the combining unit 13C can generate a corrected frame image by combining a plurality of second corrected images.
  • The combining unit 13C can generate the corrected frame image I c by Equation (13).
  • the number of second correction images is N
  • the weight is wi.
  • the weight wi is larger as the absolute value of Di (
  • the combining unit 13C may calculate wi based on a function that linearly increases as
  • the image synthesizing unit 14 synthesizes the frame image of interest and the correction frame image to generate and output a frame image (hereinafter referred to as “output frame image”) in which blinking due to flash or the like is suppressed.
  • When the frame image of interest is determined to be a frame image including a bright region caused by blinking of light, the image composition unit 14 calculates a composition ratio for each pixel and generates the output frame image by composition processing. In other cases, the image composition unit 14 uses the input frame image of interest as the output frame image as it is. Given the composition ratio u (x, y) at the target pixel I t (x, y) at the position (x, y), the image composition unit 14 calculates the value I out (x, y) of the output frame image at the same position as shown in Equation (14).
  • the image composition unit 14 can calculate the composition ratio using the change rate of the local area luminance between the target frame image and the corrected frame image.
  • The image composition unit 14 can calculate the local region luminance change rate r t-es between the frame image of interest and the corrected frame image by a method similar to the one the determination unit 11 uses to calculate the local region luminance change rate.
  • The image composition unit 14 calculates the composition ratio u (x, y) at the target pixel at the position (x, y) from the change rate r t-es (x, y) of the local region luminance at the same position, such that the change rate of the local region luminance in the output frame image becomes r tar (x, y).
  • The image composition unit 14 may calculate the composition ratio using the change rate of the rectangular region luminance. Specifically, the image composition unit 14 first calculates the composition ratio U for each rectangular region from the rectangular region luminance change rate R t-es, calculated by the same method as the determination unit 11, and the preset rectangular region luminance change rate of the output frame image that corresponds to that value of R t-es. Next, the image composition unit 14 obtains the composition ratio u for each pixel from the composition ratio U for each rectangular region by linear interpolation or bicubic interpolation.
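The second step, obtaining a per-pixel ratio u from per-rectangle ratios U, can be sketched with bilinear (linear) interpolation. The sketch assumes square rectangles of `block` x `block` pixels whose ratio U applies at the rectangle center, with values held constant outside the outermost centers; these layout details and the function names are assumptions, since the source does not specify them.

```python
def _axis(pos, block, n):
    """Return (index0, index1, fraction) for interpolating along one axis."""
    # Position measured in units of rectangles, relative to rectangle centers.
    g = min(max((pos + 0.5) / block - 0.5, 0.0), n - 1.0)
    i0 = min(int(g), max(n - 2, 0))
    i1 = min(i0 + 1, n - 1)
    return i0, i1, g - i0

def interpolate_ratio(U, block, width, height):
    """Bilinearly interpolate per-rectangle ratios U into a per-pixel map u[y][x]."""
    rows, cols = len(U), len(U[0])
    u = []
    for y in range(height):
        y0, y1, fy = _axis(y, block, rows)
        line = []
        for x in range(width):
            x0, x1, fx = _axis(x, block, cols)
            top = U[y0][x0] * (1 - fx) + U[y0][x1] * fx
            bottom = U[y1][x0] * (1 - fx) + U[y1][x1] * fx
            line.append(top * (1 - fy) + bottom * fy)
        u.append(line)
    return u

# Two 4x4 rectangles side by side: no correction on the left, full on the right.
per_pixel = interpolate_ratio([[0.0, 1.0]], block=4, width=8, height=4)
```

Interpolating rather than copying U into every pixel avoids a visible seam at rectangle boundaries in the output frame image.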
  • the determination unit 11 determines whether or not the frame image of interest at time t is a frame image including a bright region caused by blinking of light due to flash or the like that may induce a photosensitivity seizure (S11).
  • The motion estimation unit 12 selects motion estimation frame images from a plurality of frame images including the frame image of interest, and estimates the amount of image movement caused by the motion of the camera and the subject between the motion estimation frame images (S12).
  • Based on the movement amount of the pixels caused by the motion of the camera and the subject estimated between the motion estimation frame images, the image generation unit 13 estimates the amount of image movement caused by the motion of the camera and the subject between each motion estimation frame image and the target frame image. In addition, the image generation unit 13 converts each motion estimation frame image into an image at the time of the target frame image, and generates a corrected frame image by synthesizing the converted images (S13).
  • the image synthesizing unit 14 synthesizes the attention frame image and the correction frame image, and generates and outputs an output frame image in which blinking due to flash or the like is suppressed (S14).
  • The video processing apparatus 100 can generate a natural video in which luminance fluctuation is suppressed from a video containing large luminance changes that may induce a photosensitivity seizure.
  • This is because the video processing apparatus 100 synthesizes, with the target frame image containing a region with a large luminance change, a frame image without that luminance change estimated from other frame images, while varying the weight for each pixel. As a result, the video processing apparatus 100 can correct only the region with a large luminance change and restore information lost because of blinking or the like.
  • Blinking caused by flashes and the like occurs, for example, at a press conference.
  • A participant who is the subject sits down at the conference seat and leaves after the conference.
  • The camera follows the subject as the subject moves. In this case, the shooting range of the camera moves with the subject.
  • Because the video processing apparatus 100 corrects the image by estimating the motion of the camera and the subject, it can suppress shaking and blurring of contours and generate a smooth video.
  • the video processing apparatus 100 can be similarly applied to a case where the blinking region is a dark region that is darker than the other frame images by a predetermined level or more (the luminance is reduced) in the frame image of interest.
  • The determination unit 11 determines whether the frame image of interest at time t contains a region that is darker, by a predetermined level or more, than the frame image at time (t + k) among the plurality of input frame images. For example, using a preset threshold α′ for the luminance fluctuation rate and a preset threshold β′ for the area ratio, the determination unit 11 makes this determination based on whether the area ratio of the region in which the local region luminance change rate r t-t+k falls below the threshold α′ exceeds the threshold β′.
  • When the determination unit 11 determines that the frame image of interest at time t contains a region that is significantly darker than the frame image at time (t + k), it sets the determination flag flag t-t+k to "1". Otherwise, the determination unit 11 sets the determination flag flag t-t+k to "0". The determination unit 11 calculates a determination flag for the combination of the target frame image with every other input frame image. The determination unit 11 determines that the target frame image is a frame image including a dark region caused by blinking of light when there is a frame image whose determination flag is "1" both before and after the target frame image in time.
  • the determination unit 11 may use a method using a change rate of the luminance of the rectangular area. For example, the determination unit 11 uses the threshold value ⁇ ′ for the luminance fluctuation rate and the threshold value ⁇ ′ for the area ratio, which are set in advance, so that the area ratio of the region where the change rate of the luminance of the rectangular area is lower than the threshold value ⁇ ′ Depending on whether or not it exceeds, “1” or “0” is set to the determination flag flag t-t + k .
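The dark-region determination described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the determination unit 11: the change rate is assumed to be the relative difference of block-mean luminance, and the names `block_means`, `dark_flag`, `has_dark_blink`, `rate_thresh`, and `area_thresh` are hypothetical.

```python
from typing import Dict, List

Frame = List[List[float]]  # luminance values, indexed as frame[row][column]


def block_means(frame: Frame, block: int) -> List[float]:
    """Mean luminance of each non-overlapping block x block rectangle."""
    rows, cols = len(frame), len(frame[0])
    means = []
    for top in range(0, rows, block):
        for left in range(0, cols, block):
            cells = [frame[y][x]
                     for y in range(top, min(top + block, rows))
                     for x in range(left, min(left + block, cols))]
            means.append(sum(cells) / len(cells))
    return means


def dark_flag(frame_t: Frame, frame_tk: Frame,
              rate_thresh: float, area_thresh: float, block: int = 2) -> int:
    """Return 1 if frame_t has a sufficiently large region darker than frame_tk.

    Assumption: the change rate of a block is (mean_t - mean_tk) / mean_tk,
    so a darker block yields a negative rate and rate_thresh is negative.
    """
    eps = 1e-6
    m_t = block_means(frame_t, block)
    m_tk = block_means(frame_tk, block)
    rates = [(a - b) / max(b, eps) for a, b in zip(m_t, m_tk)]
    # Area rate: fraction of blocks whose change rate falls below the threshold.
    dark_fraction = sum(r < rate_thresh for r in rates) / len(rates)
    return 1 if dark_fraction > area_thresh else 0


def has_dark_blink(frames: Dict[int, Frame], t: int,
                   rate_thresh: float, area_thresh: float) -> bool:
    """True if a flag of 1 exists both before and after time t."""
    flags = {k: dark_flag(frames[t], f, rate_thresh, area_thresh)
             for k, f in frames.items() if k != t}
    before = any(flag == 1 for k, flag in flags.items() if k < t)
    after = any(flag == 1 for k, flag in flags.items() if k > t)
    return before and after


bright = [[100.0] * 4 for _ in range(4)]
dark = [[40.0] * 4 for _ in range(4)]
frames = {0: bright, 1: dark, 2: bright}
print(has_dark_blink(frames, 1, rate_thresh=-0.3, area_thresh=0.5))
```

Requiring a flagged frame on both sides of the frame of interest is what separates a momentary dark blink from a lasting change such as a scene cut.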
  • The video processing apparatus 100 can likewise be applied to a change in saturation, such as a red flash. Accordingly, the embodiments described above may include modes in which “luminance” is read as “saturation” or as “luminance or saturation”.
  • The embodiments of the present invention can be applied to a video editing system for editing video recorded on a hard disk or the like.
  • The embodiments of the present invention can also be applied to a video camera, a display terminal, and the like by using frame images held in a memory.
  • The embodiments of the present invention can be configured by hardware, but can also be realized by a computer program.
  • In that case, the video processing apparatus 100 realizes the same functions and operations as those in the above-described embodiments by a processor that operates according to a program stored in a program memory.
  • Alternatively, only some of the functions may be realized by a computer program.
  • FIG. 11 is a block diagram illustrating a hardware configuration of the computer apparatus 200 that implements the video processing apparatus 100.
  • The computer apparatus 200 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, a RAM (Random Access Memory) 203, a storage device 204, a drive device 205, a communication interface 206, and an input/output interface 207.
  • The video processing apparatus 100 can be realized by the configuration (or part thereof) shown in FIG.
  • The CPU 201 executes the program 208 using the RAM 203.
  • The program 208 may be stored in the ROM 202.
  • The program 208 may also be recorded on a recording medium 209 such as a flash memory and read by the drive device 205, or may be transmitted from an external device via the network 210.
  • The communication interface 206 exchanges data with an external device via the network 210.
  • The input/output interface 207 exchanges data with peripheral devices (such as an input device and a display device).
  • The communication interface 206 and the input/output interface 207 can function as means for acquiring or outputting data.
  • The video processing apparatus 100 may be configured by a single circuit (such as a processor) or by a combination of a plurality of circuits.
  • The circuit here may be either dedicated or general purpose.
  • (Appendix 1) A video processing apparatus comprising: determination means for determining whether any of a plurality of temporally continuous frame images is a frame image of interest including a blinking region whose luminance or saturation differs by a predetermined level or more from that of the preceding and following frame images; motion estimation means for estimating a first movement amount caused by movement of a camera and/or a second movement amount caused by movement of a subject, based on a pair of frame images selected based on the difference in luminance or saturation between the frame image of interest and the frame images before and after it; image generation means for generating a correction frame image corresponding to a frame image at the shooting time of the frame image of interest, based on the selected pair and the estimated first movement amount and/or second movement amount; and image combining means for combining the frame image of interest and the correction frame image.
  • The video processing apparatus according to appendix 1, wherein the motion estimation means includes selection means for selecting at least one frame image of the pair from frame images other than the frame image of interest.
  • The video processing apparatus according to appendix 2, wherein the motion estimation means includes first estimation means for calculating a geometric transformation parameter based on a positional relationship between corresponding points or corresponding regions detected between the pair of frame images, and estimating the first movement amount.
  • The video processing apparatus wherein the motion estimation means includes second estimation means for detecting a subject area from one frame image of the pair based on the first movement amount, detecting a corresponding area that corresponds to the subject area from the other frame image of the pair, and estimating the second movement amount based on the subject area and the corresponding area.
  • The video processing apparatus wherein the motion estimation means includes second estimation means for detecting a subject area from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and estimating the second movement amount based on the detected subject area.
  • The video processing apparatus according to any one of appendices 1 to 5, wherein the image generating means includes: first correction means for generating a first corrected image from each frame image of the pair based on the first movement amount; second correction means for generating a second corrected image from each of the first corrected images based on the second movement amount; and combining means for combining the second corrected images.
  • The video processing apparatus wherein the determination means determines, as the frame image of interest, a frame image in which a region whose rate of change in luminance or saturation relative to any other frame image is at or above, or at or below, a specified value occupies a specified area or more.
  • The video processing apparatus according to any one of appendices 1 to 7, wherein the image composition means calculates, based on a predetermined function, a composition ratio for combining the frame image of interest and the correction frame image.
  • The video processing apparatus according to any one of appendices 1 to 8, wherein the image composition means sets the composition ratio for combining the frame image of interest and the correction frame image such that the composition ratio of the correction frame image is larger for an area where the rate of change between the frame image of interest and the correction frame image is larger.
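The composition-ratio behaviour stated in the appendices above (a larger share of the correction frame image where the two images differ more) can be illustrated with a small sketch. The source only says that the ratio is calculated with "a predetermined function"; the clamped linear ramp used here, the per-pixel change rate, and the names `composition_ratio`, `blend`, `low`, and `high` are all assumptions made for illustration.

```python
from typing import List

Frame = List[List[float]]  # luminance values, indexed as frame[row][column]


def composition_ratio(change_rate: float, low: float = 0.1, high: float = 0.5) -> float:
    """Weight of the correction frame image, in [0, 1].

    Assumed function: 0 below `low`, 1 above `high`, linear in between.
    """
    if change_rate <= low:
        return 0.0
    if change_rate >= high:
        return 1.0
    return (change_rate - low) / (high - low)


def blend(interest: Frame, correction: Frame,
          low: float = 0.1, high: float = 0.5) -> Frame:
    """Combine the two frames pixel by pixel using the composition ratio."""
    eps = 1e-6
    out = []
    for row_i, row_c in zip(interest, correction):
        out_row = []
        for i, c in zip(row_i, row_c):
            # Assumed change rate: relative difference against the correction frame.
            rate = abs(i - c) / max(c, eps)
            w = composition_ratio(rate, low, high)
            out_row.append((1.0 - w) * i + w * c)
        out.append(out_row)
    return out


interest = [[100.0, 200.0]]    # second pixel is hit by a flash
correction = [[100.0, 100.0]]  # flash-free estimate for the same instant
print(blend(interest, correction))
```

With a ramp of this kind, pixels that barely differ keep the original frame content, while strongly blinking pixels are replaced by the correction frame image.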
  • a subject area is detected from one frame image of the pair based on the first movement amount, a corresponding area corresponding to the subject area is detected from the other frame image of the pair, and the subject area and the corresponding area are detected.
  • a subject area is detected from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and the second pixel movement amount is detected based on the detected subject area.
  • (Appendix 17) The video processing method according to any one of appendices 10 to 16, wherein a composition ratio for combining the frame image of interest and the correction frame image is calculated based on a predetermined function.
  • The composition ratio of the correction frame image is increased for an area where the rate of change between the frame image of interest and the correction frame image is large.
  • (Appendix 22) In the estimation process, a subject area is detected from each frame image of the pair by subtracting the first movement amount based on the geometric transformation parameter, and the second pixel movement amount is detected based on the detected subject area.
  • A frame image in which a region whose rate of change in luminance or saturation relative to another frame image is at or above, or at or below, a specified value occupies a specified area or more is determined to be the frame image of interest.
  • The composition ratio of the correction frame image is large for an area where the rate of change between the frame image of interest and the correction frame image is large.
  • A video processing apparatus comprising: selection means for selecting a first frame image and a second frame image from a plurality of temporally continuous frame images; first estimation means for calculating a geometric transformation parameter based on a positional relationship between corresponding points or corresponding regions detected between the first frame image and the second frame image, and estimating a first movement amount caused by camera movement; and second estimation means for detecting a subject area from the first frame image and the second frame image by subtracting the first movement amount based on the geometric transformation parameter, and estimating, based on the detected subject area, a second movement amount caused by movement of the subject.
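The two-stage estimation in the last appendix (camera movement first, then subject movement after subtracting it) can be sketched in a heavily simplified form. The assumptions are substantial: the geometric transformation is restricted to a pure translation, corresponding points are taken as given, and subject points are identified by their residual motion. The names `estimate_camera_shift`, `estimate_subject_shift`, and `tol` are hypothetical and do not come from the source.

```python
from statistics import mean, median
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position in pixels


def estimate_camera_shift(src: List[Point], dst: List[Point]) -> Tuple[float, float]:
    """First movement amount: translation shared by most corresponding points.

    Assumption: background points outnumber subject points, so the per-axis
    median displacement approximates the camera-induced shift.
    """
    dx = median([d[0] - s[0] for s, d in zip(src, dst)])
    dy = median([d[1] - s[1] for s, d in zip(src, dst)])
    return dx, dy


def estimate_subject_shift(src: List[Point], dst: List[Point],
                           camera_shift: Tuple[float, float],
                           tol: float = 1.0) -> Tuple[float, float]:
    """Second movement amount: motion left after subtracting the camera shift."""
    cx, cy = camera_shift
    residuals = [(d[0] - s[0] - cx, d[1] - s[1] - cy) for s, d in zip(src, dst)]
    # Points that still move after camera compensation are treated as the subject.
    moving = [r for r in residuals if abs(r[0]) > tol or abs(r[1]) > tol]
    if not moving:
        return 0.0, 0.0
    return mean([r[0] for r in moving]), mean([r[1] for r in moving])


# Four background points shifted by (2, 1); one subject point moves 4 px further in x.
src = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
dst = [(2, 1), (12, 1), (2, 11), (12, 11), (11, 6)]
camera = estimate_camera_shift(src, dst)
subject = estimate_subject_shift(src, dst, camera)
print(camera, subject)
```

A fuller version would fit an affine or projective transformation to the correspondences and detect the subject area on the image itself; the translation-only model is used here just to keep the two movement amounts easy to follow.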

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The present invention provides an image processing device capable of generating a natural image in which flicker is reduced. An image processing device (100) includes a determination unit (11), a motion estimation unit (12), an image generation unit (13), and an image composition unit (14). The determination unit (11) determines which of a plurality of temporally consecutive frame images is a frame image of interest containing a blinking region in which the luminance or saturation differs from that of the preceding or following frame image by a predetermined level or more. The motion estimation unit (12) estimates a first movement amount caused by movement of a camera and/or a second movement amount caused by movement of a subject, based on a pair of frame images selected based on a difference in luminance or saturation between the frame image of interest and the preceding or following frame image. Based on the selected pair and the estimated first movement amount and/or the estimated second movement amount, the image generation unit (13) generates a correction frame image corresponding to a frame image at the time the frame image of interest was captured. The image composition unit (14) combines the frame image of interest and the correction frame image.
PCT/JP2016/000013 2015-01-06 2016-01-05 Image processing device, image processing method, and program recording medium WO2016111239A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2016568360A JP6708131B2 (ja) 2015-01-06 2016-01-05 Video processing device, video processing method, and program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2015-000630 2015-01-06
JP2015000630 2015-01-06

Publications (1)

Publication Number Publication Date
WO2016111239A1 (fr) 2016-07-14

Family

ID=56355932

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/000013 WO2016111239A1 (fr) 2015-01-06 2016-01-05 Image processing device, image processing method, and program recording medium

Country Status (2)

Country Link
JP (1) JP6708131B2 (fr)
WO (1) WO2016111239A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020059425A1 (fr) * 2018-09-18 2020-03-26 株式会社日立国際電気 Imaging device, image processing method, and program
JP7538583B2 (ja) 2022-06-08 2024-08-22 パナソニックオートモーティブシステムズ株式会社 Detection system and detection method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
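JPH09224250A (ja) * 1996-02-16 1997-08-26 Nippon Hoso Kyokai <Nhk> Level fluctuation detection device and image quality improvement device
JP2000069325A (ja) * 1998-08-26 2000-03-03 Fujitsu Ltd Image display control device and recording medium
JP2007193192A (ja) * 2006-01-20 2007-08-02 Nippon Hoso Kyokai <Nhk> Video analysis device, visual stimulus risk level determination program, and video analysis system
JP2010141486A (ja) * 2008-12-10 2010-06-24 Fujifilm Corp Image composition device, image composition method, and image composition program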

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020059425A1 (fr) * 2018-09-18 2020-03-26 株式会社日立国際電気 Imaging device, image processing method, and program
US11310446B2 (en) 2018-09-18 2022-04-19 Hitachi Kokusai Electric Inc. Imaging device, image processing method, and program
JP7538583B2 (ja) 2022-06-08 2024-08-22 パナソニックオートモーティブシステムズ株式会社 Detection system and detection method

Also Published As

Publication number Publication date
JP6708131B2 (ja) 2020-06-10
JPWO2016111239A1 (ja) 2017-10-19

Similar Documents

Publication Publication Date Title
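CN108335279B (zh) Image fusion and HDR imaging
US9661239B2 (en) System and method for online processing of video images in real time
TWI767985B (zh) Method and apparatus for processing an image property map
US9202263B2 (en) System and method for spatio video image enhancement
JP7136080B2 (ja) Imaging device and imaging method, and image processing device and image processing method
JP4210954B2 (ja) Image processing method, program for the image processing method, recording medium recording the program for the image processing method, and image processing device
JP7285791B2 (ja) Image processing device, output information control method, and program
US11910001B2 (en) Real-time image generation in moving scenes
JP2010258710A (ja) Motion vector detection device, control method therefor, and imaging device
CN105931213B (zh) Method for de-ghosting high dynamic range video based on edge detection and the frame difference method
KR20150108774A (ko) Method for processing a video sequence, corresponding device, computer program, and non-transitory computer-readable medium
KR20150145725A (ko) Method and apparatus for extending the dynamic range of an LDR video sequence
JP2015082768A (ja) Image processing device, image processing method, program, and storage medium
JP2009135561A (ja) Blur detection device, and blur correction device and method
TW200919366A (en) Image generation method and apparatus, program therefor, and storage medium for storing the program
JP2018124890A (ja) Image processing device, image processing method, and image processing program
JP2018195084A (ja) Image processing device, image processing method, program, and storage medium
KR20200045682A (ko) Parallax-minimizing stitching apparatus and method using HLBP descriptor information
JP6365355B2 (ja) Image generation device and image generation method
JP2018061130A (ja) Image processing device, image processing method, and program
Geo et al. Globalflownet: Video stabilization using deep distilled global motion estimates
JP6708131B2 (ja) Video processing device, video processing method, and program
JP2015138399A (ja) Image processing device, image processing method, and computer program
Van Vo et al. High dynamic range video synthesis using superpixel-based illuminance-invariant motion estimation
JP6582994B2 (ja) Image processing device, image processing method, and program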

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16734981

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2016568360

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16734981

Country of ref document: EP

Kind code of ref document: A1