CN109743589A - Article generation method and device - Google Patents
Article generation method and device
- Publication number
- CN109743589A CN109743589A CN201811600339.XA CN201811600339A CN109743589A CN 109743589 A CN109743589 A CN 109743589A CN 201811600339 A CN201811600339 A CN 201811600339A CN 109743589 A CN109743589 A CN 109743589A
- Authority
- CN
- China
- Prior art keywords
- paragraph
- sentence
- adjacent
- words
- time difference
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- User Interface Of Digital Computer (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention proposes an article generation method and device. The method includes: obtaining a video and its corresponding speech, recognizing the speech, and obtaining individual sentences; obtaining feature information for each sentence, dividing the sentences into paragraphs according to the feature information, and obtaining a paragraph sequence; for each paragraph in the paragraph sequence, obtaining the key sentence in the paragraph; obtaining the time period corresponding to the key sentence, and selecting a key video frame from the video segment corresponding to that time period as the picture for the paragraph; and generating an article from the paragraphs in the paragraph sequence and their corresponding pictures. Because the article contains each paragraph together with its corresponding picture, it can effectively convey the video content, so that users can easily choose videos they want to watch, improving video playback efficiency.
Description
Technical field
The present invention relates to the technical field of video processing, and in particular to an article generation method and device.
Background technique
At present, before a video is published, the video can be analyzed and a frame from it selected as the video's thumbnail, so that after the video is published users can first get an idea of the video content from the thumbnail and then decide whether to watch it. However, a thumbnail shows very little content and can hardly convey the video content effectively, so users find it difficult to choose videos they want to watch and may abandon playback after choosing a video they did not actually want, which reduces video playback efficiency.
Summary of the invention
The present invention aims to solve at least one of the technical problems in the related art.
To this end, the first object of the present invention is to propose an article generation method, to solve the prior-art problems that a video thumbnail can hardly convey the video content effectively and that video playback efficiency is low.
The second object of the present invention is to propose an article generation device.
The third object of the present invention is to propose another article generation device.
The fourth object of the present invention is to propose a non-transitory computer-readable storage medium.
The fifth object of the present invention is to propose a computer program product.
To achieve the above objects, an embodiment of the first aspect of the present invention proposes an article generation method, comprising:
obtaining a video and its corresponding speech, recognizing the speech, and obtaining individual sentences;
obtaining feature information for each sentence, dividing the sentences into paragraphs according to the feature information, and obtaining a paragraph sequence;
for each paragraph in the paragraph sequence, obtaining the key sentence in the paragraph;
obtaining the time period corresponding to the key sentence, and selecting a key video frame from the video segment corresponding to the time period in the video as the picture corresponding to the paragraph;
generating an article from the paragraphs in the paragraph sequence and their corresponding pictures.
Further, recognizing the speech and obtaining the sentences comprises:
recognizing the speech to obtain each word and the timestamp corresponding to each word;
for any two adjacent words, calculating the time difference between the two adjacent words according to their corresponding timestamps;
judging whether the time difference is greater than or equal to a first difference threshold;
if the time difference is less than the first difference threshold, dividing the two adjacent words into the same sentence;
if the time difference is greater than or equal to the first difference threshold, dividing the two adjacent words into different sentences, thereby obtaining the sentences.
Further, the feature information includes the middle timestamp corresponding to a sentence and whether the sentence contains a conjunction;
dividing the sentences into paragraphs according to the feature information and obtaining the paragraph sequence comprises:
for any two adjacent sentences, calculating the time difference between the two adjacent sentences according to their corresponding middle timestamps;
judging whether the time difference is greater than or equal to a second difference threshold and whether the latter of the two adjacent sentences contains a conjunction;
if the time difference is less than the second difference threshold, or the latter of the two adjacent sentences contains a conjunction, dividing the two adjacent sentences into the same paragraph;
if the time difference is greater than or equal to the second difference threshold and the latter of the two adjacent sentences does not contain a conjunction, dividing the two adjacent sentences into different paragraphs, thereby obtaining the paragraph sequence.
Further, the second difference threshold is determined as follows:
generating a time-difference set from the time differences of all pairs of adjacent sentences;
calculating the standard deviation of the time-difference set;
determining the product of the standard deviation and a preset coefficient as the second difference threshold.
Further, after obtaining the feature information of each sentence, dividing the sentences into paragraphs according to the feature information, and obtaining the paragraph sequence, the method further includes:
for each paragraph in the paragraph sequence, obtaining the word count of the paragraph;
judging whether the word count of the paragraph is less than a preset word-count threshold;
if the word count of the paragraph is less than the preset word-count threshold, merging the paragraph with the adjacent following paragraph until the word count of the merged paragraph is greater than or equal to the preset word-count threshold.
Further, obtaining, for each paragraph in the paragraph sequence, the key sentence in the paragraph comprises:
obtaining the title of the video;
inputting all sentences in the paragraph sequence together with the title into a preset keyword model, obtaining keywords and their corresponding weights, and generating a keyword set;
for each paragraph in the paragraph sequence, querying the keyword set with each sentence in the paragraph to obtain the keywords contained in each sentence;
determining the weight of each sentence according to the keywords it contains and their corresponding weights;
determining the sentence with the largest weight in the paragraph as the key sentence of the paragraph.
Further, obtaining the time period corresponding to the key sentence comprises:
obtaining the middle timestamp corresponding to the key sentence;
determining the time period corresponding to the key sentence according to the middle timestamp and a preset threshold, wherein the start time point of the time period is the difference between the middle timestamp and the preset threshold, and the end time point of the time period is the sum of the middle timestamp and the preset threshold.
According to the article generation method of the embodiment of the present invention, a video and its corresponding speech are obtained, the speech is recognized, and sentences are obtained; feature information of each sentence is obtained, the sentences are divided into paragraphs according to the feature information, and a paragraph sequence is obtained; for each paragraph in the paragraph sequence, the key sentence in the paragraph is obtained; the time period corresponding to the key sentence is obtained, and a key video frame is selected from the video segment corresponding to the time period as the picture of the paragraph; and an article is generated from the paragraphs and their corresponding pictures. Because the article contains each paragraph together with its corresponding picture, it can effectively convey the video content, so that users can easily choose videos they want to watch, improving video playback efficiency.
To achieve the above objects, an embodiment of the second aspect of the present invention proposes an article generation device, comprising:
an obtaining module, configured to obtain a video and its corresponding speech, recognize the speech, and obtain sentences;
a division module, configured to obtain feature information of each sentence, divide the sentences into paragraphs according to the feature information, and obtain a paragraph sequence;
the obtaining module being further configured to obtain, for each paragraph in the paragraph sequence, the key sentence in the paragraph;
a selection module, configured to obtain the time period corresponding to the key sentence and select a key video frame from the video segment corresponding to the time period in the video as the picture corresponding to the paragraph;
a generation module, configured to generate an article from the paragraphs in the paragraph sequence and their corresponding pictures.
Further, the obtaining module is specifically configured to:
recognize the speech to obtain each word and the timestamp corresponding to each word;
for any two adjacent words, calculate the time difference between the two adjacent words according to their corresponding timestamps;
judge whether the time difference is greater than or equal to a first difference threshold;
if the time difference is less than the first difference threshold, divide the two adjacent words into the same sentence;
if the time difference is greater than or equal to the first difference threshold, divide the two adjacent words into different sentences, thereby obtaining the sentences.
Further, the feature information includes the middle timestamp corresponding to a sentence and whether the sentence contains a conjunction;
the division module is specifically configured to:
for any two adjacent sentences, calculate the time difference between the two adjacent sentences according to their corresponding middle timestamps;
judge whether the time difference is greater than or equal to a second difference threshold and whether the latter of the two adjacent sentences contains a conjunction;
if the time difference is less than the second difference threshold, or the latter of the two adjacent sentences contains a conjunction, divide the two adjacent sentences into the same paragraph;
if the time difference is greater than or equal to the second difference threshold and the latter of the two adjacent sentences does not contain a conjunction, divide the two adjacent sentences into different paragraphs, thereby obtaining the paragraph sequence.
Further, the second difference threshold is determined as follows:
generating a time-difference set from the time differences of all pairs of adjacent sentences;
calculating the standard deviation of the time-difference set;
determining the product of the standard deviation and a preset coefficient as the second difference threshold.
Further, the device further includes a judgment module and a merging module;
the obtaining module is further configured to obtain, for each paragraph in the paragraph sequence, the word count of the paragraph;
the judgment module is configured to judge whether the word count of the paragraph is less than a preset word-count threshold;
the merging module is configured to, when the word count of the paragraph is less than the preset word-count threshold, merge the paragraph with the adjacent following paragraph until the word count of the merged paragraph is greater than or equal to the preset word-count threshold.
Further, the obtaining module is specifically configured to:
obtain the title of the video;
input all sentences in the paragraph sequence together with the title into a preset keyword model, obtain keywords and their corresponding weights, and generate a keyword set;
for each paragraph in the paragraph sequence, query the keyword set with each sentence in the paragraph to obtain the keywords contained in each sentence;
determine the weight of each sentence according to the keywords it contains and their corresponding weights;
determine the sentence with the largest weight in the paragraph as the key sentence of the paragraph.
Further, the selection module is specifically configured to:
obtain the middle timestamp corresponding to the key sentence;
determine the time period corresponding to the key sentence according to the middle timestamp and a preset threshold, wherein the start time point of the time period is the difference between the middle timestamp and the preset threshold, and the end time point of the time period is the sum of the middle timestamp and the preset threshold.
According to the article generation device of the embodiment of the present invention, a video and its corresponding speech are obtained, the speech is recognized, and sentences are obtained; feature information of each sentence is obtained, the sentences are divided into paragraphs according to the feature information, and a paragraph sequence is obtained; for each paragraph in the paragraph sequence, the key sentence in the paragraph is obtained; the time period corresponding to the key sentence is obtained, and a key video frame is selected from the video segment corresponding to the time period as the picture of the paragraph; and an article is generated from the paragraphs and their corresponding pictures. Because the article contains each paragraph together with its corresponding picture, it can effectively convey the video content, so that users can easily choose videos they want to watch, improving video playback efficiency.
To achieve the above objects, an embodiment of the third aspect of the present invention proposes another article generation device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the article generation method described above when executing the program.
To achieve the above objects, an embodiment of the fourth aspect of the present invention proposes a computer-readable storage medium on which a computer program is stored, the program implementing the article generation method described above when executed by a processor.
To achieve the above objects, an embodiment of the fifth aspect of the present invention proposes a computer program product which, when the instructions in the computer program product are executed by a processor, implements the article generation method described above.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the following description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of an article generation method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an article generation device provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of another article generation device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of yet another article generation device provided by an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and are not to be construed as limiting it.
The article generation method and device of the embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of an article generation method provided by an embodiment of the present invention. As shown in Fig. 1, the article generation method includes the following steps.
S101: obtain a video and its corresponding speech, recognize the speech, and obtain sentences.
The execution subject of the article generation method provided by the present invention is an article generation device, which may be a hardware device such as a terminal device or a server, or software installed on a hardware device. In this embodiment, the video may be, for example, a video to be published.
In this embodiment, because people pause when speaking, especially between sentences, pauses exist in the speech corresponding to the video; therefore, the sentences in the speech can be determined from the timestamp corresponding to each word. Accordingly, the process by which the article generation device recognizes the speech and obtains the sentences may specifically be: recognize the speech to obtain each word and its corresponding timestamp; for any two adjacent words, calculate the time difference between them according to their timestamps; judge whether the time difference is greater than or equal to a first difference threshold; if the time difference is less than the first difference threshold, divide the two adjacent words into the same sentence; if the time difference is greater than or equal to the first difference threshold, divide the two adjacent words into different sentences, thereby obtaining the sentences. The timestamp of a word may be its start timestamp, middle timestamp, or end timestamp. The first difference threshold may be, for example, 0.2 seconds or 0.3 seconds.
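The word-gap segmentation just described can be sketched as follows. This is a minimal illustration, not the patent's implementation: the recognizer output format (a list of word/timestamp pairs) and the 0.3-second threshold are assumptions for the example.

```python
def split_into_sentences(words, first_threshold=0.3):
    """words: list of (text, timestamp) pairs, timestamps in seconds.

    Adjacent words whose timestamp gap is below the first difference
    threshold stay in the same sentence; a gap at or above it starts a
    new sentence.
    """
    sentences, current = [], []
    for i, (text, ts) in enumerate(words):
        if current and ts - words[i - 1][1] >= first_threshold:
            sentences.append(current)
            current = []
        current.append((text, ts))
    if current:
        sentences.append(current)
    return sentences


words = [("hello", 0.0), ("world", 0.2), ("next", 1.0), ("sentence", 1.1)]
print(split_into_sentences(words))  # two sentences: the 0.8 s gap splits them
```

In practice the word timestamps would come from a speech recognizer that emits word-level timing, and the threshold would be tuned to the speaker's pace.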
S102: obtain feature information of each sentence, divide the sentences into paragraphs according to the feature information, and obtain a paragraph sequence.
In this embodiment, the feature information may include the middle timestamp corresponding to a sentence, whether the sentence contains a conjunction, and so on, where conjunctions are words such as "and" and "but". Accordingly, the process by which the article generation device performs step S102 may specifically be: for any two adjacent sentences, calculate the time difference between them according to their middle timestamps; judge whether the time difference is greater than or equal to a second difference threshold and whether the latter of the two adjacent sentences contains a conjunction; if the time difference is less than the second difference threshold, or the latter sentence contains a conjunction, divide the two adjacent sentences into the same paragraph; if the time difference is greater than or equal to the second difference threshold and the latter sentence does not contain a conjunction, divide the two adjacent sentences into different paragraphs, thereby obtaining the paragraph sequence.
The second difference threshold may be determined as follows: generate a time-difference set from the time differences of all pairs of adjacent sentences; calculate the standard deviation of the time-difference set; and determine the product of the standard deviation and a preset coefficient as the second difference threshold. The second difference threshold is greater than the first difference threshold. The preset coefficient may be, for example, N, where N may take a value such as 2.
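The adaptive threshold and the paragraph division rule above can be sketched together. This is an illustrative reading of the method, not the patent's code: the sentence representation, the coefficient N = 2, and the tiny conjunction list are assumptions, and the test for a conjunction (first word of the latter sentence) is one plausible interpretation.

```python
import statistics


def second_threshold(mid_times, coefficient=2.0):
    # Standard deviation of the gaps between adjacent sentences' middle
    # timestamps, scaled by the preset coefficient N.
    gaps = [b - a for a, b in zip(mid_times, mid_times[1:])]
    return coefficient * statistics.pstdev(gaps)


def split_into_paragraphs(sentences, conjunctions=("and", "but")):
    """sentences: list of (text, middle_timestamp) pairs."""
    threshold = second_threshold([m for _, m in sentences])
    paragraphs, current = [], [sentences[0]]
    for prev, cur in zip(sentences, sentences[1:]):
        gap = cur[1] - prev[1]
        has_conjunction = cur[0].split()[0].lower() in conjunctions
        # New paragraph only when the gap is large AND the latter
        # sentence does not begin with a conjunction.
        if gap >= threshold and not has_conjunction:
            paragraphs.append(current)
            current = []
        current.append(cur)
    paragraphs.append(current)
    return paragraphs
```

Note that if all gaps are identical, the standard deviation (and hence the threshold) is zero, so a real implementation would likely impose a floor on the threshold.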
Further, on the basis of the above embodiments, because a paragraph generally contains a certain number of words, in order to divide paragraphs more accurately, after step S102 the above method may further include the following steps: for each paragraph in the paragraph sequence, obtain the word count of the paragraph; judge whether the word count of the paragraph is less than a preset word-count threshold; if the word count of the paragraph is less than the preset word-count threshold, merge the paragraph with the adjacent following paragraph until the word count of the merged paragraph is greater than or equal to the preset word-count threshold.
For example, if the word count of the first paragraph is less than the preset word-count threshold, the first and second paragraphs are merged into one paragraph; it is then judged whether the word count of the merged paragraph is less than the preset word-count threshold, and if so, the merged paragraph is merged with the third paragraph to obtain a new merged paragraph. At this point, if the word count of the re-merged paragraph is greater than or equal to the preset word-count threshold, the merging of that paragraph stops; the fourth paragraph is then obtained and it is judged whether its word count is less than the preset word-count threshold.
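The merging step can be sketched as below. The 20-word threshold is an illustrative assumption, and the handling of a short trailing paragraph (kept as-is) is a choice the source does not specify.

```python
def merge_short_paragraphs(paragraphs, min_words=20):
    """paragraphs: list of strings.

    A paragraph below the word-count threshold is merged with the next
    paragraph, repeatedly, until the accumulated paragraph reaches the
    threshold.
    """
    merged, buffer = [], ""
    for paragraph in paragraphs:
        buffer = (buffer + " " + paragraph).strip() if buffer else paragraph
        if len(buffer.split()) >= min_words:
            merged.append(buffer)
            buffer = ""
    if buffer:
        merged.append(buffer)  # trailing short paragraph, kept as-is
    return merged
```

Counting words by whitespace splitting is itself an assumption; for Chinese text a character count or a tokenizer would be used instead.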
S103: for each paragraph in the paragraph sequence, obtain the key sentence in the paragraph.
In this embodiment, the key sentence of a paragraph is the sentence that best embodies the central idea of the paragraph. The process by which the article generation device performs step S103 may specifically be: obtain the title of the video; input all sentences in the paragraph sequence together with the title into a preset keyword model, obtain keywords and their corresponding weights, and generate a keyword set; for each paragraph in the paragraph sequence, query the keyword set with each sentence in the paragraph to obtain the keywords contained in each sentence; determine the weight of each sentence according to the keywords it contains and their corresponding weights; and determine the sentence with the largest weight in the paragraph as the key sentence of the paragraph.
A keyword may be a word that appears frequently in all the sentences, or a word that embodies the central idea of all the paragraphs. The keyword model may be a neural network model or the like, and may be trained on training texts and the keyword sets corresponding to those texts.
In this embodiment, the process of determining the weight of each sentence according to the keywords it contains and their corresponding weights may specifically be: for each sentence, obtain the keywords contained in the sentence, the number of occurrences of each keyword, and the weight corresponding to each keyword; calculate the product of each keyword's number of occurrences and its weight to obtain a value; and sum the values of all the contained keywords to obtain the weight of the sentence.
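The sentence scoring just described can be sketched as follows. The keyword weights would come from the trained keyword model; the hard-coded dictionary here, and the whitespace tokenization, are illustrative assumptions.

```python
def sentence_weight(sentence, keyword_weights):
    # Sum over keywords of (occurrence count in the sentence) x (keyword weight).
    tokens = sentence.lower().split()
    return sum(tokens.count(kw) * w for kw, w in keyword_weights.items())


def key_sentence(paragraph_sentences, keyword_weights):
    # The sentence with the largest weight is the paragraph's key sentence.
    return max(paragraph_sentences,
               key=lambda s: sentence_weight(s, keyword_weights))


weights = {"video": 0.9, "article": 0.7}  # assumed model output
paragraph = ["the weather is nice", "the video becomes an article"]
print(key_sentence(paragraph, weights))
```

A tie between sentences is resolved here by `max` (first occurrence wins); the source does not say how ties should be broken.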
S104: obtain the time period corresponding to the key sentence, and select a key video frame from the video segment corresponding to the time period in the video as the picture corresponding to the paragraph.
In this embodiment, the process by which the article generation device obtains the time period corresponding to the key sentence may specifically be: obtain the middle timestamp corresponding to the key sentence; determine the time period corresponding to the key sentence according to the middle timestamp and a preset threshold, where the start time point of the time period is the difference between the middle timestamp and the preset threshold, and the end time point of the time period is the sum of the middle timestamp and the preset threshold.
The time period corresponding to the key sentence may thus lie within the period determined by its start and end time points. In this embodiment, the article generation device may select the most complete video frame from the video segment as the key video frame.
S105: generate an article from the paragraphs in the paragraph sequence and their corresponding pictures.
In this embodiment, for a paragraph sequence containing three paragraphs, the article may contain the content of the first paragraph, the picture corresponding to the first paragraph, the content of the second paragraph, the picture corresponding to the second paragraph, the content of the third paragraph, and the picture corresponding to the third paragraph.
In this embodiment, after generating the article, the article generation device may generate a link address corresponding to the article. When the video is published, the article's link address is displayed on the video's publishing page, so that before watching the video a user can first browse the article through the link, determine from the article whether the video is one they want to watch, and then decide whether to watch it.
In addition, after generating the article, the article generation device may generate a link address corresponding to the video and display it on the page where the article is located, so that after reading the article, a user who is interested in its content can click the link directly to watch the video.
According to the article generation method of the embodiment of the present invention, a video and its corresponding speech are obtained, the speech is recognized, and sentences are obtained; feature information of each sentence is obtained, the sentences are divided into paragraphs according to the feature information, and a paragraph sequence is obtained; for each paragraph in the paragraph sequence, the key sentence in the paragraph is obtained; the time period corresponding to the key sentence is obtained, and a key video frame is selected from the video segment corresponding to the time period as the picture of the paragraph; and an article is generated from the paragraphs and their corresponding pictures. Because the article contains each paragraph together with its corresponding picture, it can effectively convey the video content, so that users can easily choose videos they want to watch, improving video playback efficiency.
Fig. 2 is a schematic structural diagram of an article generation device provided by an embodiment of the present invention. As shown in Fig. 2, the device includes: an obtaining module 21, a division module 22, a selection module 23, and a generation module 24.
The obtaining module 21 is configured to obtain a video and its corresponding speech, recognize the speech, and obtain sentences.
The division module 22 is configured to obtain feature information of each sentence, divide the sentences into paragraphs according to the feature information, and obtain a paragraph sequence.
The obtaining module 21 is further configured to obtain, for each paragraph in the paragraph sequence, the key sentence in the paragraph.
The selection module 23 is configured to obtain the time period corresponding to the key sentence and select a key video frame from the video segment corresponding to the time period in the video as the picture corresponding to the paragraph.
The generation module 24 is configured to generate an article from the paragraphs in the paragraph sequence and their corresponding pictures.
The article generation device provided by the present invention may be a hardware device such as a terminal device or a server, or software installed on a hardware device. In this embodiment, the video may be, for example, a video to be published.
In this embodiment, because people pause when speaking, especially between sentences, pauses exist in the speech corresponding to the video; therefore, the sentences in the speech can be determined from the timestamp corresponding to each word. Accordingly, the obtaining module 21 may specifically be configured to: recognize the speech to obtain each word and its corresponding timestamp; for any two adjacent words, calculate the time difference between them according to their timestamps; judge whether the time difference is greater than or equal to a first difference threshold; if the time difference is less than the first difference threshold, divide the two adjacent words into the same sentence; if the time difference is greater than or equal to the first difference threshold, divide the two adjacent words into different sentences, thereby obtaining the sentences. The timestamp of a word may be its start timestamp, middle timestamp, or end timestamp. The first difference threshold may be, for example, 0.2 seconds or 0.3 seconds.
In this embodiment, the feature information may include the middle timestamp corresponding to a sentence, whether the sentence contains a conjunction, and so on, where conjunctions are words such as "and" and "but". Accordingly, the division module 22 may specifically be configured to: for any two adjacent sentences, calculate the time difference between them according to their middle timestamps; judge whether the time difference is greater than or equal to a second difference threshold and whether the latter of the two adjacent sentences contains a conjunction; if the time difference is less than the second difference threshold, or the latter sentence contains a conjunction, divide the two adjacent sentences into the same paragraph; if the time difference is greater than or equal to the second difference threshold and the latter sentence does not contain a conjunction, divide the two adjacent sentences into different paragraphs, thereby obtaining the paragraph sequence.
The second difference threshold may be determined as follows: generate a time-difference set from the time differences of all pairs of adjacent sentences; calculate the standard deviation of the time-difference set; and determine the product of the standard deviation and a preset coefficient as the second difference threshold. The second difference threshold is greater than the first difference threshold.
Further, on the basis of the above embodiments, since a paragraph generally contains a certain number of words, in order to divide paragraphs more accurately, with reference to Fig. 3, the device may further include a judgment module 25 and a merging module 26.
The acquisition module 21 is further configured to, for each paragraph in the paragraph sequence, obtain the word count of the paragraph.
The judgment module 25 is configured to judge whether the word count of the paragraph is less than a preset word count threshold.
The merging module 26 is configured to, when the word count of the paragraph is less than the preset word count threshold, merge the paragraph with the adjacent paragraph behind it, until the word count of the merged paragraph is greater than or equal to the preset word count threshold.
For example, if the word count of the first paragraph is less than the preset word count threshold, the first paragraph and the second paragraph are merged into one paragraph. It is then judged whether the word count of the merged paragraph is less than the preset word count threshold; if so, the merged paragraph is further merged with the third paragraph. If the word count of the resulting paragraph is greater than or equal to the preset word count threshold, merging of that paragraph stops. The fourth paragraph is then obtained, and it is judged whether the word count of the fourth paragraph is less than the preset word count threshold.
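The merging behaviour of the merging module 26 described above can be sketched as follows, assuming paragraphs are plain whitespace-separated strings. Counting whitespace-split words is an illustrative simplification; for Chinese text the embodiment's "number of words" would more likely be a character count.

```python
def merge_short_paragraphs(paragraphs, min_words):
    """Merge each too-short paragraph with the adjacent paragraph behind
    it, until the merged paragraph reaches the preset word-count threshold.
    """
    merged, buffer, count = [], [], 0
    for para in paragraphs:
        buffer.append(para)
        count += len(para.split())
        if count >= min_words:            # merged paragraph is long enough
            merged.append(" ".join(buffer))
            buffer, count = [], 0
    if buffer:                            # trailing short remainder kept as-is
        merged.append(" ".join(buffer))
    return merged
```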
In this embodiment, the key sentence of a paragraph is the sentence that best reflects the central idea of the paragraph. Correspondingly, the acquisition module 21 may be specifically configured to: obtain the title of the video; input all sentences in the paragraph sequence together with the title into a preset keyword model to obtain keywords and their corresponding weights, thereby generating a keyword set; for each paragraph in the paragraph sequence, query the keyword set with each sentence in the paragraph to obtain the keywords contained in each sentence; determine the weight of each sentence according to the keywords it contains and their corresponding weights; and determine the sentence with the largest weight in the paragraph as the key sentence of the paragraph.
A keyword may be a word that occurs frequently in all sentences, or a word that reflects the central idea of all paragraphs. The keyword model may be, for example, a neural network model, and may be trained on training texts and keyword sets corresponding to the training texts.
In this embodiment, the process of determining the weight of each sentence according to the keywords contained in the sentence and their corresponding weights may specifically be as follows: for each sentence, obtain the keywords contained in the sentence, the number of occurrences of each keyword, and the weight corresponding to each keyword; calculate the product of the number of occurrences and the weight of each keyword to obtain a value; and sum the values of all contained keywords to obtain the weight of the sentence.
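The sentence-weighting rule just described (sum, over contained keywords, of occurrence count times keyword weight, then take the maximum) can be sketched as follows. The keyword-to-weight mapping here is a hypothetical stand-in for the output of the preset keyword model.

```python
def key_sentence(paragraph_sentences, keyword_weights):
    """Return the sentence with the largest weight in the paragraph.

    A sentence's weight is the sum, over every keyword it contains, of
    (occurrence count of the keyword in the sentence) * (keyword weight).
    """
    def weight(sentence):
        words = sentence.lower().split()
        return sum(words.count(kw) * w for kw, w in keyword_weights.items())
    return max(paragraph_sentences, key=weight)
```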
In this embodiment, the process by which the acquisition module 21 obtains the time period corresponding to the key sentence may specifically be as follows: obtain the middle timestamp corresponding to the key sentence; and determine the time period corresponding to the key sentence according to the middle timestamp and a preset threshold, where the start time point of the time period is the difference between the middle timestamp and the preset threshold, and the end time point of the time period is the sum of the middle timestamp and the preset threshold.
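The time-period computation reduces to a symmetric window around the key sentence's middle timestamp; a minimal sketch:

```python
def key_sentence_period(mid_ts, preset_threshold):
    """Time period corresponding to the key sentence: its start time
    point is the middle timestamp minus the preset threshold, and its
    end time point is the middle timestamp plus the preset threshold."""
    return (mid_ts - preset_threshold, mid_ts + preset_threshold)
```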
The time period corresponding to the key sentence may lie within the period determined by the start time point and the end time point of the key sentence. In this embodiment, the article generating device may select the most complete video frame from the video segment as the key video frame.
In this embodiment, after the article is generated, the article generating device may generate a link address corresponding to the article. When the video is published, the link address of the article is displayed on the publishing page of the video, so that before watching the video, the user can first browse the article through the link address, determine from the article whether the video is one he or she wants to watch, and then decide whether to watch the video.
In addition, after the article is generated, the article generating device may generate a link address corresponding to the video and display it on the page where the article is located, so that after reading the article, a user who is interested in its content can click the link address directly to watch the video.
The article generating device of the embodiment of the present invention obtains a video and its corresponding speech, recognizes the speech to obtain sentences, obtains characteristic information of each sentence, divides the sentences into paragraphs according to the characteristic information to obtain a paragraph sequence, obtains the key sentence of each paragraph in the paragraph sequence, obtains the time period corresponding to the key sentence, selects a key video frame from the video segment corresponding to the time period as the picture corresponding to the paragraph, and generates an article according to the paragraphs in the paragraph sequence and their corresponding pictures. Since the article includes each paragraph and its corresponding picture, it can effectively reflect the video content, so that users can easily choose videos they want to watch, improving video playing efficiency.
Fig. 4 is the structural schematic diagram of another article generating means provided in an embodiment of the present invention.This article generating means
Include:
a memory 1001, a processor 1002, and a computer program stored in the memory 1001 and executable on the processor 1002.
When executing the program, the processor 1002 implements the article generation method provided in the above embodiments.
Further, the article generating device further includes:
a communication interface 1003, for communication between the memory 1001 and the processor 1002.
The memory 1001 is configured to store the computer program executable on the processor 1002.
The memory 1001 may include a high-speed RAM memory, and may further include a non-volatile memory, for example, at least one magnetic disk memory.
The processor 1002 is configured to implement the article generation method described in the above embodiments when executing the program.
If the memory 1001, the processor 1002, and the communication interface 1003 are implemented independently, the communication interface 1003, the memory 1001, and the processor 1002 may be connected to each other through a bus and complete mutual communication. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Fig. 4, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 1001, the processor 1002, and the communication interface 1003 are integrated on one chip, the memory 1001, the processor 1002, and the communication interface 1003 may complete mutual communication through internal interfaces.
The processor 1002 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The present invention further provides a non-transitory computer readable storage medium on which a computer program is stored; when the program is executed by a processor, the article generation method as described above is implemented.
The present invention further provides a computer program product; when instructions in the computer program product are executed by a processor, the article generation method as described above is implemented.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples", and the like means that a specific feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example two or three, unless otherwise specifically defined.
Any process or method description in the flowcharts or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing a custom logic function or steps of a process, and the scope of the preferred embodiments of the present invention includes other implementations, in which the functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention pertain.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example, may be considered a sequenced list of executable instructions for implementing logic functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection portion (electronic device) having one or more wirings, a portable computer disk cartridge (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field programmable gate array (FPGA), and the like.
Those skilled in the art can understand that all or part of the steps carried by the above embodiment methods may be completed by instructing relevant hardware through a program; the program may be stored in a computer-readable storage medium, and when executed, the program performs one of or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware, or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and shall not be understood as limiting the present invention, and those skilled in the art can change, modify, replace, and vary the above embodiments within the scope of the present invention.
Claims (17)
1. An article generation method, characterized by comprising:
obtaining a video and corresponding speech, and recognizing the speech to obtain sentences;
obtaining characteristic information of each sentence, and dividing the sentences into paragraphs according to the characteristic information to obtain a paragraph sequence;
for each paragraph in the paragraph sequence, obtaining a key sentence in the paragraph;
obtaining a time period corresponding to the key sentence, and selecting a key video frame from a video segment in the video corresponding to the time period as a picture corresponding to the paragraph; and
generating an article according to each paragraph in the paragraph sequence and the corresponding picture.
2. The method according to claim 1, characterized in that the recognizing the speech to obtain sentences comprises:
recognizing the speech to obtain words and a timestamp corresponding to each word;
for any two adjacent words, calculating a time difference between the two adjacent words according to the timestamps corresponding to the two adjacent words;
judging whether the time difference is greater than or equal to a first difference threshold;
if the time difference is less than the first difference threshold, dividing the two adjacent words into the same sentence; and
if the time difference is greater than or equal to the first difference threshold, dividing the two adjacent words into different sentences, thereby obtaining the sentences.
3. The method according to claim 1, characterized in that the characteristic information comprises: a middle timestamp corresponding to a sentence, and whether the sentence contains a conjunction;
the dividing the sentences into paragraphs according to the characteristic information to obtain a paragraph sequence comprises:
for any two adjacent sentences, calculating a time difference between the two adjacent sentences according to the middle timestamps corresponding to the two adjacent sentences;
judging whether the time difference is greater than or equal to a second difference threshold and whether the latter of the two adjacent sentences contains a conjunction;
if the time difference is less than the second difference threshold, or the latter of the two adjacent sentences contains a conjunction, dividing the two adjacent sentences into the same paragraph; and
if the time difference is greater than or equal to the second difference threshold and the latter of the two adjacent sentences does not contain a conjunction, dividing the two adjacent sentences into different paragraphs, thereby obtaining the paragraph sequence.
4. The method according to claim 3, characterized in that the second difference threshold is determined by:
generating a time difference set according to the time differences of all pairs of adjacent sentences;
calculating a standard deviation of the time difference set; and
determining a product of the standard deviation and a predetermined coefficient as the second difference threshold.
5. The method according to claim 1 or 3, characterized in that, after the obtaining characteristic information of each sentence, dividing the sentences into paragraphs according to the characteristic information, and obtaining a paragraph sequence, the method further comprises:
for each paragraph in the paragraph sequence, obtaining a word count of the paragraph;
judging whether the word count of the paragraph is less than a preset word count threshold; and
if the word count of the paragraph is less than the preset word count threshold, merging the paragraph with the adjacent paragraph behind it, until the word count of the merged paragraph is greater than or equal to the preset word count threshold.
6. The method according to claim 1, characterized in that the obtaining, for each paragraph in the paragraph sequence, a key sentence in the paragraph comprises:
obtaining a title of the video;
inputting all sentences in the paragraph sequence and the title into a preset keyword model to obtain keywords and corresponding weights, and generating a keyword set;
for each paragraph in the paragraph sequence, querying the keyword set with each sentence in the paragraph to obtain the keywords contained in each sentence;
determining a weight of each sentence according to the keywords contained in the sentence and the weights corresponding to the keywords; and
determining the sentence with the largest weight in the paragraph as the key sentence in the paragraph.
7. The method according to claim 1, characterized in that the obtaining a time period corresponding to the key sentence comprises:
obtaining a middle timestamp corresponding to the key sentence; and
determining the time period corresponding to the key sentence according to the middle timestamp corresponding to the key sentence and a preset threshold, wherein a start time point of the time period is a difference between the middle timestamp and the preset threshold, and an end time point of the time period is a sum of the middle timestamp and the preset threshold.
8. An article generating device, characterized by comprising:
an acquisition module, configured to obtain a video and corresponding speech, and recognize the speech to obtain sentences;
a division module, configured to obtain characteristic information of each sentence, and divide the sentences into paragraphs according to the characteristic information to obtain a paragraph sequence;
the acquisition module being further configured to, for each paragraph in the paragraph sequence, obtain a key sentence in the paragraph;
a selection module, configured to obtain a time period corresponding to the key sentence, and select a key video frame from a video segment in the video corresponding to the time period as a picture corresponding to the paragraph; and
a generation module, configured to generate an article according to each paragraph in the paragraph sequence and the corresponding picture.
9. The device according to claim 8, characterized in that the acquisition module is specifically configured to:
recognize the speech to obtain words and a timestamp corresponding to each word;
for any two adjacent words, calculate a time difference between the two adjacent words according to the timestamps corresponding to the two adjacent words;
judge whether the time difference is greater than or equal to a first difference threshold;
if the time difference is less than the first difference threshold, divide the two adjacent words into the same sentence; and
if the time difference is greater than or equal to the first difference threshold, divide the two adjacent words into different sentences, thereby obtaining the sentences.
10. The device according to claim 8, characterized in that the characteristic information comprises: a middle timestamp corresponding to a sentence, and whether the sentence contains a conjunction;
the division module is specifically configured to:
for any two adjacent sentences, calculate a time difference between the two adjacent sentences according to the middle timestamps corresponding to the two adjacent sentences;
judge whether the time difference is greater than or equal to a second difference threshold and whether the latter of the two adjacent sentences contains a conjunction;
if the time difference is less than the second difference threshold, or the latter of the two adjacent sentences contains a conjunction, divide the two adjacent sentences into the same paragraph; and
if the time difference is greater than or equal to the second difference threshold and the latter of the two adjacent sentences does not contain a conjunction, divide the two adjacent sentences into different paragraphs, thereby obtaining the paragraph sequence.
11. The device according to claim 10, characterized in that the second difference threshold is determined by:
generating a time difference set according to the time differences of all pairs of adjacent sentences;
calculating a standard deviation of the time difference set; and
determining a product of the standard deviation and a predetermined coefficient as the second difference threshold.
12. The device according to claim 8 or 10, characterized by further comprising: a judgment module and a merging module, wherein
the acquisition module is further configured to, for each paragraph in the paragraph sequence, obtain a word count of the paragraph;
the judgment module is configured to judge whether the word count of the paragraph is less than a preset word count threshold; and
the merging module is configured to, when the word count of the paragraph is less than the preset word count threshold, merge the paragraph with the adjacent paragraph behind it, until the word count of the merged paragraph is greater than or equal to the preset word count threshold.
13. The device according to claim 8, characterized in that the acquisition module is specifically configured to:
obtain a title of the video;
input all sentences in the paragraph sequence and the title into a preset keyword model to obtain keywords and corresponding weights, and generate a keyword set;
for each paragraph in the paragraph sequence, query the keyword set with each sentence in the paragraph to obtain the keywords contained in each sentence;
determine a weight of each sentence according to the keywords contained in the sentence and the weights corresponding to the keywords; and
determine the sentence with the largest weight in the paragraph as the key sentence in the paragraph.
14. The device according to claim 8, characterized in that the selection module is specifically configured to:
obtain a middle timestamp corresponding to the key sentence; and
determine the time period corresponding to the key sentence according to the middle timestamp corresponding to the key sentence and a preset threshold, wherein a start time point of the time period is a difference between the middle timestamp and the preset threshold, and an end time point of the time period is a sum of the middle timestamp and the preset threshold.
15. An article generating device, characterized by comprising:
a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when executing the program, the processor implements the article generation method according to any one of claims 1-7.
16. A non-transitory computer readable storage medium on which a computer program is stored, characterized in that when the program is executed by a processor, the article generation method according to any one of claims 1-7 is implemented.
17. A computer program product, characterized in that when instructions in the computer program product are executed by a processor, the article generation method according to any one of claims 1-7 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811600339.XA CN109743589B (en) | 2018-12-26 | 2018-12-26 | Article generation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109743589A true CN109743589A (en) | 2019-05-10 |
CN109743589B CN109743589B (en) | 2021-12-14 |
Family
ID=66359996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811600339.XA Active CN109743589B (en) | 2018-12-26 | 2018-12-26 | Article generation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109743589B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110245339A (en) * | 2019-06-20 | 2019-09-17 | 北京百度网讯科技有限公司 | Article generation method, device, equipment and storage medium |
CN111883136A (en) * | 2020-07-30 | 2020-11-03 | 潘忠鸿 | Rapid writing method and device based on artificial intelligence |
CN111966839A (en) * | 2020-08-17 | 2020-11-20 | 北京奇艺世纪科技有限公司 | Data processing method and device, electronic equipment and computer storage medium |
CN112733654A (en) * | 2020-12-31 | 2021-04-30 | 支付宝(杭州)信息技术有限公司 | Method and device for splitting video strip |
CN113286173A (en) * | 2021-05-19 | 2021-08-20 | 北京沃东天骏信息技术有限公司 | Video editing method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110239107A1 (en) * | 2010-03-29 | 2011-09-29 | Phillips Michael E | Transcript editor |
CN104794104A (en) * | 2015-04-30 | 2015-07-22 | 努比亚技术有限公司 | Multimedia document generating method and device |
CN106134216A (en) * | 2014-04-11 | 2016-11-16 | 三星电子株式会社 | Broadcast receiver and method for clip Text service |
CN106982344A (en) * | 2016-01-15 | 2017-07-25 | 阿里巴巴集团控股有限公司 | video information processing method and device |
CN107305541A (en) * | 2016-04-20 | 2017-10-31 | 科大讯飞股份有限公司 | Speech recognition text segmentation method and device |
Also Published As
Publication number | Publication date |
---|---|
CN109743589B (en) | 2021-12-14 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |