CN104469179B

CN104469179B - Method for adding a dynamic picture to a mobile video

Info

Publication number: CN104469179B
Application number: CN201410803625.1A
Authority: CN
Inventors: 王强宇
Original assignee: HANGZHOU DUANQU NETWORK MEDIA TECHNOLOGY Co Ltd
Current assignee: Alibaba Cloud Computing Co., Ltd.
Priority date: 2014-12-22
Filing date: 2014-12-22
Publication date: 2017-08-04
Anticipated expiration: 2034-12-22
Also published as: CN104469179A

Abstract

The invention discloses a method for adding a dynamic picture to a mobile video, comprising the following steps: obtaining a video and a dynamic picture composed of several PNG sequence frames, where dynamic pictures fall into three classes: face-tracking, text-editable, and neither face-tracking nor text-editable; performing face detection on all frames of the obtained video; selecting a dynamic picture and checking the current frame of the video for a face; and displaying the dynamic picture according to whether the current frame contains a face and according to the type of the dynamic picture. The invention adds the dynamic picture after the video has been shot. This post-shooting approach lets the user freely place a dynamic picture on any frame and imposes no limit on the number of dynamic pictures added. Face tracking makes adding dynamic pictures more convenient.

Description

A method for adding a dynamic picture to a mobile video

Technical field

The present invention relates to face tracking technology, and more particularly to a method for adding a dynamic picture to a mobile video.

Background art

At present, known schemes for adding or merging dynamic pictures into mobile video all add the dynamic picture during the shooting stage. The dynamic picture is added before shooting begins and shooting then proceeds; when a segment finishes shooting, the animation ends with it. This approach is simple but limited: the dynamic picture cannot be manipulated while the video is being shot, for example rotated or scaled, and the duration for which it is displayed cannot be freely adjusted. This greatly reduces the enjoyment of adding dynamic pictures and of shooting video. The rise of smartphones, together with factors such as high-definition cameras and convenient recording, has greatly increased users' demand for shooting video, and how to make video more enjoyable deserves careful thought.

Summary of the invention

The object of the present invention is to address the deficiencies of existing schemes for adding dynamic pictures to mobile video by providing a method for adding a dynamic picture to a mobile video. The dynamic picture is added after the video has been shot; this post-shooting approach lets the user freely place a dynamic picture on any frame and imposes no limit on the number of dynamic pictures added.

The object of the present invention is achieved through the following technical solution: a method for adding a dynamic picture to a mobile video, comprising the following steps:

(1) Acquiring the video and the dynamic picture

The video is either shot with the phone's built-in camera or obtained from the system photo album;

The dynamic picture is composed of 3-30 PNG sequence frames, and each dynamic picture corresponds to one JSON configuration table. The JSON configuration table contains the following information: the sequence number (id) of the dynamic picture, name (n), duration (du), type (type), display position coordinates, and text editing information. Here type=0 denotes a dynamic picture with editable text, type=1 denotes a face-tracking dynamic picture, and type=2 denotes a dynamic picture that neither tracks a face nor has editable text;

(2) Acquiring and displaying video thumbnails: the video obtained in step (1) is converted into frame pictures using the AVFoundation framework and divided evenly into N segments by frame count; the first picture at each segment node is used as a thumbnail of the video, and the thumbnails are arranged in order;

(3) Scanning the video obtained in step (1): face detection is performed on all frames of the obtained video using the CIDetector of CoreImage or the HaarCascadeClassifier in OpenCV. If a face is detected, the facial feature points and coordinate information are saved; otherwise no data is saved;

(4) Adding and displaying the dynamic picture on the video, specifically comprising the following sub-steps:

(4.1) A frame of the video obtained in step (1) is displayed by the player;

(4.2) A dynamic picture is selected, and the current frame of the video is checked for a face;

If a face is displayed, the JSON configuration table of the dynamic picture is consulted to determine whether type is 1, that is, whether it is a face-tracking dynamic picture;

If type=1, the facial feature points and coordinate information of the current frame obtained in step (3) are combined with the display position coordinate information in the JSON configuration table of the dynamic picture to calculate the coordinates at which the center point of the dynamic picture is placed on the face in the current frame, and the dynamic picture is displayed at that position;

If a face is displayed but type ≠ 1, or if no face is displayed, the dynamic picture is placed at the initial position given in the display position coordinate information of the JSON configuration table;

If type=0, the text editing information in the JSON configuration table is used to display the text on the dynamic picture;

(5) Editing the dynamic picture, specifically comprising the following sub-steps:

(5.1) Adjusting the starting frame at which the dynamic picture is displayed in the video: a frame of the video is selected through the thumbnails generated in step (2), and the dynamic picture plays from that frame for its default display duration;

(5.2) Adjusting the size and rotation angle of the dynamic picture: the dynamic picture is dragged to rotate and scale it;

(5.3) Closing the dynamic picture and completing its addition: closing the dynamic picture removes it from the current frame; after the current dynamic picture has been added, another dynamic picture may be added to the video following the operations of step (4); once all additions are complete, the information of all dynamic pictures is saved into an array ordered by time point;

(6) Compositing the dynamic picture and the video: the video frames are rendered into a context using OpenGL ES, and the array of dynamic picture data saved in step (5) is traversed. If a dynamic picture was added at a given time point in the video, the corresponding sequence frame of the dynamic picture is scaled and rotated by matrix transformation according to the data for that time point in the array and rendered into the context. Every frame of the video is processed in this way, and the video composited with the dynamic pictures is output.
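The per-frame logic of step (6) can be sketched without any rendering API. The Python below is only an illustration under stated assumptions: the keys `start`, `scale`, and `angle` and the function names are hypothetical (the patent does not name the fields stored in the array), and the actual OpenGL ES drawing is left out. It shows which dynamic pictures are active at each frame time and the scale-rotate-translate matrix that would be applied to them.

```python
import math

def overlay_matrix(cx, cy, scale, angle_deg):
    """3x3 affine matrix: scale and rotate about the picture center (0, 0),
    then move that center to (cx, cy) in video coordinates."""
    a = math.radians(angle_deg)
    c, s = scale * math.cos(a), scale * math.sin(a)
    return [[c, -s, cx],
            [s,  c, cy],
            [0.0, 0.0, 1.0]]

def transform(m, x, y):
    """Apply the affine matrix m to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

def composite_plan(overlays, frame_count, fps=30):
    """For every video frame, list the (pid, matrix) pairs to render on it."""
    plan = []
    for i in range(frame_count):
        t = i / fps
        # A dynamic picture is active from its start time for its duration du.
        active = [o for o in overlays if o["start"] <= t < o["start"] + o["du"]]
        plan.append([(o["pid"], overlay_matrix(o["x"], o["y"], o["scale"], o["angle"]))
                     for o in active])
    return plan

# Hypothetical saved array: one dynamic picture added at t = 1.0 s.
overlays = [{"pid": 1, "start": 1.0, "du": 1.1,
             "x": 320.0, "y": 320.0, "scale": 2.0, "angle": 90.0}]
plan = composite_plan(overlays, frame_count=90)
```

A renderer would walk `plan` frame by frame, draw the video frame first, and then draw each listed sequence frame with its matrix.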

Further, in step (4), when the current frame of the video shows a face and type=1, the dynamic picture is displayed as follows. Four variables are set in the display position coordinates of the JSON configuration table: X-axis coordinate fx, Y-axis coordinate fy, animation width fw, and animation height fh. The line connecting the two eyes of the face in the current video frame is taken as the X axis, the perpendicular line through the mouth is taken as the Y axis, and the intersection of the two axes is the coordinate origin. The distance from the origin to the left/right eye is defined as X=1, and the distance from the origin to the mouth is defined as Y=1. The dynamic picture is pre-positioned in advance against a picture containing a face, in prototyping software or Photoshop (PS), as shown in Fig. 2: the dynamic picture is placed at the relevant position on the face to obtain its coordinates, and the center point coordinates of the dynamic picture are calculated once it is placed. The size at which the dynamic picture is displayed at the corresponding face position is obtained as a multiple of the distance from the origin to the left/right eye: the animation width fw is a multiple of the origin-to-eye distance X, and the animation height fh is obtained from the current fw value and the default aspect ratio of the dynamic picture.
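The face-relative coordinate system can be illustrated with a small calculation. The sketch below is one possible reading, not the patented implementation: it assumes the origin is the midpoint between the eyes (the patent defines it as the intersection of the two axes), and the function name and the pixel coordinates of the eyes and mouth are made up for the example. The fx, fy, and fw values and the 400 x 120 default size are the ones used in the configuration table example of the embodiment.

```python
import math

def place_on_face(left_eye, right_eye, mouth, fx, fy, fw, aspect):
    """Map face-relative values (fx, fy, fw) to pixel center and size.

    aspect is the default width / height ratio of the dynamic picture.
    """
    # Origin: assumed to be the midpoint between the two eyes.
    ox = (left_eye[0] + right_eye[0]) / 2.0
    oy = (left_eye[1] + right_eye[1]) / 2.0
    # One unit along X is the vector from the origin to an eye.
    ux = (right_eye[0] - ox, right_eye[1] - oy)
    # One unit along Y is the vector from the origin to the mouth.
    uy = (mouth[0] - ox, mouth[1] - oy)
    cx = ox + fx * ux[0] + fy * uy[0]
    cy = oy + fx * ux[1] + fy * uy[1]
    # Width is a multiple of the origin-to-eye distance; height follows the aspect ratio.
    width = fw * math.hypot(ux[0], ux[1])
    height = width / aspect
    return (cx, cy), (width, height)

# Hypothetical detected landmarks (pixels) for one frame.
center, size = place_on_face((100.0, 100.0), (200.0, 100.0), (150.0, 160.0),
                             fx=0.0, fy=-0.5, fw=4.55, aspect=400.0 / 120.0)
```

With these inputs the picture is centered above the eye line (negative fy points away from the mouth), and its width scales with the eye spacing, so a larger face receives a proportionally larger picture.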

Further, in step (4), when type=0, the dynamic picture is displayed as follows. Variables are set in the text editing information of the JSON configuration table: text content ptext, font tfont, text rotation angle tangle, the center point coordinates of the text input box (tleft, ttop), the width twidth and height theight of the text input box, the text color tRGB, and the appearance time point tbegin and end time point tend of the text. The text is displayed according to the data in the text editing information.
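As an illustration of how the text editing information might be consumed, the following Python sketch derives the text box rectangle from its center and size and checks whether the text is visible at a given time. The key spellings follow the configuration table example in the embodiment; the function names are hypothetical, and treating tEnd as inclusive is an assumption.

```python
def text_box(cfg):
    """Return (left, top, right, bottom) of the text input box.

    tLeft/tTop are the center point of the box; twidth/tHeight are its size.
    """
    half_w = cfg["twidth"] / 2.0
    half_h = cfg["tHeight"] / 2.0
    return (cfg["tLeft"] - half_w, cfg["tTop"] - half_h,
            cfg["tLeft"] + half_w, cfg["tTop"] + half_h)

def text_visible(cfg, t):
    """True while t (seconds into the dynamic picture) lies in [tBegin, tEnd]."""
    return cfg["tBegin"] <= t <= cfg["tEnd"]

def text_color(cfg):
    """Return the text color as an (R, G, B) tuple of integers."""
    return tuple(int(cfg[k]) for k in ("tR", "tG", "tB"))

# Text fields copied from the configuration table example.
cfg = {"tLeft": 205.0, "tTop": 170.0, "twidth": 210.0, "tHeight": 190.0,
       "tR": 255.0, "tG": 255.0, "tB": 255.0, "tBegin": 0.5, "tEnd": 1.45}
box = text_box(cfg)
```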

The beneficial effects of the invention are as follows: the dynamic picture is added after the video has been shot. This post-shooting approach lets the user freely place a dynamic picture on any frame; adding dynamic pictures is controllable and not constrained by the shooting of the video, and there is no limit on the number of dynamic pictures added. Face tracking makes adding dynamic pictures more convenient.

Brief description of the drawings

Fig. 1 is a flow chart of the specific steps of the present invention;

Fig. 2 is a schematic diagram of pre-positioning a face-tracking dynamic picture.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings.

As shown in Fig. 1, the method of the present invention for adding a dynamic picture to a mobile video comprises the following steps:

(1) Acquiring the video and the dynamic picture

The specific implementation steps are described using a video of 2-8 seconds as an example. The video is shot with the phone's built-in camera, for a minimum of 2 seconds and a maximum of 8 seconds; a video in the system photo album may also be used, and if it is too long, 8 seconds of it are extracted.

The dynamic picture is composed of 3-30 PNG sequence frames that form a continuous animation, and each dynamic picture corresponds to one JSON configuration table. The JSON configuration table contains the following information: the sequence number (id) of the dynamic picture, name (n), duration (du), type (type), display position coordinates, and text editing information. Here type=0 denotes a dynamic picture with editable text, type=1 denotes a face-tracking dynamic picture, and type=2 denotes a dynamic picture that neither tracks a face nor has editable text.

The JSON configuration table is as follows:

    {
      "pid": 1,                // sequence id of the dynamic picture
      "fid": 1,
      "du": 1.1,               // default display duration of the dynamic picture
      "type": 1,               // type: 0 = editable text, 1 = tracks the face (expression), 2 = ordinary dynamic picture
      "x": 320.0,
      "y": 320.0,
      "w": 400.0,
      "h": 120.0,              // x, y: default position of the dynamic picture in the video; w, h: its size
      "a": 0.0,
      "fx": 0.0,
      "fy": -0.5,
      "fw": 4.55,
      "fh": 0.0,               // fx to fh: placement relative to the face, needed by face-tracking dynamic pictures
      "n": "haixiu",
      "c": 9.0,                // number of sequence frame pictures
      "pText": "Na Ni",        // for a dynamic picture with editable text, pText through tEnd must be filled in
      "tFont": "SentyTEApro",  // font used to display the text in the video; licensed web fonts may be used
      "tAngle": 0.0,           // rotation angle of the text on the dynamic picture
      "tLeft": 205.0,
      "tTop": 170.0,
      "twidth": 210.0,
      "tHeight": 190.0,        // tLeft to tHeight: center point coordinates and size of the text input area
      "tR": 255.0,
      "tG": 255.0,
      "tB": 255.0,             // RGB color of the text
      "tBegin": 0.5,
      "tEnd": 1.45,            // tBegin to tEnd: display time of the text
      "frameArry": [           // frame numbers and frame rate of the dynamic picture
        {"time": 0.0, "pic": 0},
        {"time": 0.1, "pic": 1},
        {"time": 0.2, "pic": 2},
        {"time": 0.3, "pic": 3},
        {"time": 0.4, "pic": 4},
        {"time": 0.5, "pic": 4},
        {"time": 0.6, "pic": 4},
        {"time": 0.7, "pic": 5},
        {"time": 0.8, "pic": 6},
        {"time": 0.9, "pic": 7},
        {"time": 1.0, "pic": 8}
      ]
    }
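One way to read the frameArry table is as a step function from elapsed time to sequence frame number: each entry holds until the next entry's time is reached. The Python sketch below illustrates that reading. The function name is hypothetical, and hiding the picture once du has elapsed is an assumption drawn from du being the default display duration.

```python
import bisect

def frame_at(cfg, t):
    """Return the sequence frame number to show t seconds after the start,
    or None when t falls outside the display duration."""
    if t < 0 or t >= cfg["du"]:
        return None
    times = [f["time"] for f in cfg["frameArry"]]
    # Index of the last entry whose time is <= t.
    i = bisect.bisect_right(times, t) - 1
    if i < 0:
        return None
    return cfg["frameArry"][i]["pic"]

# The timing table from the configuration example: frame 4 is held for three ticks.
pics = [0, 1, 2, 3, 4, 4, 4, 5, 6, 7, 8]
cfg = {"du": 1.1,
       "frameArry": [{"time": i / 10, "pic": p} for i, p in enumerate(pics)]}
```

Holding frame 4 across three entries is how the example slows the animation in the middle without storing extra PNG files.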

(2) Acquiring and displaying video thumbnails: the video obtained in step (1) is converted into frame pictures using the AVFoundation framework, or images are read from the local file using a video decoder. The video runs at 30 frames per second. The obtained video frames are divided evenly into 8 segments by frame count, and the first picture at each segment node is used as a thumbnail of the video, with the thumbnails arranged in order.
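The segment arithmetic can be sketched as follows. This is a minimal Python illustration of picking the first frame of each of 8 equal segments; the function name is hypothetical, and integer division is an assumed way of handling frame counts that do not divide evenly.

```python
def thumbnail_frame_indices(total_frames, segments=8):
    """Return the index of the first frame of each equal segment."""
    if total_frames < segments:
        raise ValueError("video has fewer frames than requested segments")
    return [i * total_frames // segments for i in range(segments)]

# An 8-second video at 30 frames per second has 240 frames.
indices_8s = thumbnail_frame_indices(8 * 30)
# A 2-second video has 60 frames, which do not split evenly into 8 segments.
indices_2s = thumbnail_frame_indices(2 * 30)
```

The frames at these indices would then be extracted by the decoder and laid out left to right as the thumbnail strip.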

If type=1, the facial feature points and coordinate information of the current frame obtained in step (3) are combined with the display position coordinate information in the JSON configuration table of the dynamic picture to calculate the coordinates at which the center point of the dynamic picture is placed on the face in the current frame, and the dynamic picture is displayed at that position. The dynamic picture is displayed as follows. Four variables are set in the display position coordinates of the JSON configuration table: X-axis coordinate fx, Y-axis coordinate fy, animation width fw, and animation height fh. The line connecting the two eyes of the face in the current video frame is taken as the X axis, the perpendicular line through the mouth is taken as the Y axis, and the intersection of the two axes is the coordinate origin. The distance from the origin to the left/right eye is defined as X=1, and the distance from the origin to the mouth is defined as Y=1. The dynamic picture is pre-positioned in advance against a picture containing a face, in prototyping software or Photoshop (PS): the dynamic picture is placed at the relevant position on the face to obtain its coordinates, and the center point coordinates of the dynamic picture are calculated once it is placed. The size at which the dynamic picture is displayed at the corresponding face position is obtained as a multiple of the distance from the origin to the left/right eye: the animation width fw is a multiple of the origin-to-eye distance X, and the animation height fh is obtained from the current fw value and the default aspect ratio of the dynamic picture. This ensures that faces of different sizes receive dynamic pictures of correspondingly different sizes, each matched to the face. When the face remains on screen longer than the duration of the dynamic picture itself, the dynamic picture is displayed only for its default duration; when the face remains on screen for less than the duration of the dynamic picture itself, the dynamic picture is displayed only for as long as the face is present, that is, when the face disappears the dynamic picture disappears as well.
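The duration rule amounts to taking the shorter of the two durations. The Python sketch below illustrates it, assuming per-frame face detection results are available as a list of booleans (as step (3) would produce) and that "face duration" means the run of consecutive frames with a face starting at the chosen start frame; both function names are hypothetical.

```python
def face_run_seconds(face_flags, start_frame, fps=30):
    """Seconds for which a face stays visible, counted from start_frame."""
    count = 0
    for has_face in face_flags[start_frame:]:
        if not has_face:
            break
        count += 1
    return count / fps

def display_duration(face_flags, start_frame, default_du, fps=30):
    """A face-tracking picture is shown for its default duration,
    cut short if the face disappears first."""
    return min(default_du, face_run_seconds(face_flags, start_frame, fps))

long_face = [True] * 60                   # face visible for 2.0 s
short_face = [True] * 15 + [False] * 45   # face visible for 0.5 s
d_long = display_duration(long_face, 0, default_du=1.1)
d_short = display_duration(short_face, 0, default_du=1.1)
```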

If a face is displayed but type ≠ 1, or if no face is displayed, the dynamic picture is placed with the XY values in the display position coordinate information of the JSON configuration table as its initial position;

If type=0, the text editing information in the JSON configuration table is used to display the text on the dynamic picture. The dynamic picture is displayed as follows. Variables are set in the text editing information of the JSON configuration table: text content ptext, font tfont, text rotation angle tangle, the center point coordinates of the text input box (tleft, ttop), the width twidth and height theight of the text input box, the text color tRGB, and the appearance time point tbegin and end time point tend of the text. The text is displayed according to the data in the text editing information.
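The branching on face presence and type described in this step can be condensed into a small decision function. The Python sketch below is illustrative only: the function name and the returned dictionary are hypothetical, and the face anchor is assumed to have been computed beforehand from the detected feature points (None when the current frame has no face).

```python
def resolve_display(cfg, face_anchor):
    """Decide where to draw the dynamic picture and whether to draw its text.

    cfg is the dynamic picture's configuration table; face_anchor is an
    (x, y) position computed for the current frame, or None if no face.
    """
    if face_anchor is not None and cfg["type"] == 1:
        # Face-tracking picture with a face in the frame: follow the face.
        position = face_anchor
    else:
        # No face, or not a face-tracking picture: use the default x, y.
        position = (cfg["x"], cfg["y"])
    # Only type 0 pictures carry editable text.
    return {"position": position, "show_text": cfg["type"] == 0}

tracking = {"type": 1, "x": 320.0, "y": 320.0}
with_text = {"type": 0, "x": 320.0, "y": 320.0}
on_face = resolve_display(tracking, (150.0, 70.0))
no_face = resolve_display(tracking, None)
text_case = resolve_display(with_text, (150.0, 70.0))
```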

(5.2) Adjusting the size and rotation angle of the dynamic picture: the dynamic picture is dragged to rotate and scale it (the dynamic picture can be scaled proportionally in X and Y via its parameters).

The description of the present invention is given for the purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and design various embodiments with various modifications suited to particular uses.

Claims

1. A method for adding a dynamic picture to a mobile video, characterized by comprising the following steps:

(1) Acquiring the video and the dynamic picture

The dynamic picture is composed of 3-30 PNG sequence frames, and each dynamic picture corresponds to one JSON configuration table; the JSON configuration table contains the following information: the sequence number (id) of the dynamic picture, name (n), duration (du), type (type), display position coordinates, and text editing information; wherein type=0 denotes a dynamic picture with editable text, type=1 denotes a face-tracking dynamic picture, and type=2 denotes a dynamic picture that neither tracks a face nor has editable text;

(2) Acquiring and displaying video thumbnails: the video obtained in step (1) is converted into frame pictures using the AVFoundation framework and divided evenly into N segments by frame count; the first picture at each segment node is used as a thumbnail of the video, and the thumbnails are arranged in order;

(3) Scanning the video obtained in step (1): face detection is performed on all frames of the obtained video using the CIDetector of CoreImage or the HaarCascadeClassifier in OpenCV; if a face is detected, the facial feature points and coordinate information are saved, and otherwise no data is saved;

(4-1) A frame of the video obtained in step (1) is displayed by the player;

(4-2) A dynamic picture is selected, and the current frame of the video is checked for a face;

If a face is displayed, the JSON configuration table of the dynamic picture is consulted to determine whether type is 1, that is, whether it is a face-tracking dynamic picture;

If type=1, the facial feature points and coordinate information of the current frame obtained in step (3) are combined with the display position coordinate information in the JSON configuration table of the dynamic picture to calculate the coordinates at which the center point of the dynamic picture is placed on the face in the current frame, and the dynamic picture is displayed at that position;

If a face is displayed but type ≠ 1, or if no face is displayed, the dynamic picture is placed at the initial position given in the display position coordinate information of the JSON configuration table;

(5-1) Adjusting the starting frame at which the dynamic picture is displayed in the video: a frame of the video is selected through the thumbnails generated in step (2), and the dynamic picture plays from that frame for its default display duration;

(5-2) Adjusting the size and rotation angle of the dynamic picture: the dynamic picture is dragged to rotate and scale it;

(5-3) Closing the dynamic picture and completing its addition: closing the dynamic picture removes it from the current frame; after the current dynamic picture has been added, another dynamic picture may be added to the video following the operations of step (4); once all additions are complete, the information of all dynamic pictures is saved into an array ordered by time point;

(6) Compositing the dynamic picture and the video: the video frames are rendered into a context using OpenGL ES, and the array of dynamic picture data saved in step (5) is traversed; if a dynamic picture was added at a given time point in the video, the corresponding sequence frame of the dynamic picture is scaled and rotated by matrix transformation according to the data for that time point in the array and rendered into the context; every frame of the video is processed in this way, and the video composited with the dynamic pictures is output.

2. The method for adding a dynamic picture to a mobile video according to claim 1, characterized in that, in step (4), when the current frame of the video shows a face and type=1, the dynamic picture is displayed as follows: four variables are set in the display position coordinates of the JSON configuration table: X-axis coordinate fx, Y-axis coordinate fy, animation width fw, and animation height fh; the line connecting the two eyes of the face in the current video frame is taken as the X axis, the perpendicular line through the mouth is taken as the Y axis, and the intersection of the two axes is the coordinate origin; the distance from the origin to the left/right eye is defined as X=1, and the distance from the origin to the mouth is defined as Y=1; the dynamic picture is pre-positioned in advance against a picture containing a face: the dynamic picture is placed at the relevant position on the face to obtain its coordinates, and the center point coordinates of the dynamic picture are calculated once it is placed; the size at which the dynamic picture is displayed at the corresponding face position is obtained as a multiple of the distance from the origin to the left/right eye, the animation width fw being a multiple of the origin-to-eye distance X, and the animation height fh being obtained from the current fw value and the default aspect ratio of the dynamic picture.

3. The method for adding a dynamic picture to a mobile video according to claim 1, characterized in that, in step (4), when type=0, the dynamic picture is displayed as follows: variables are set in the text editing information of the JSON configuration table: text content ptext, font tfont, text rotation angle tangle, the center point coordinates of the text input box (tleft, ttop), the width twidth and height theight of the text input box, the text color tRGB, and the appearance time point tbegin and end time point tend of the text; the text is displayed according to the data in the text editing information.