CN105975497A - Automatic microblog topic recommendation method and device - Google Patents
Automatic microblog topic recommendation method and device
- Publication number
- CN105975497A CN201610268830.1A
- Authority
- CN
- China
- Prior art keywords
- text
- topic
- content
- microblogging
- neural network
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an automatic microblog topic recommendation method and device. The method comprises the following steps: processing text contents of microblogs on the basis of a neural network model so as to obtain feature vectors; classifying the text contents of the microblogs through a softmax classifier according to the feature vectors so as to obtain topic classes; and automatically carrying out topic recommendation on the microblogs without topics according to the topic classes. According to the method, the users and the microblog management platform can be helped to manage massive microblog contents. The invention furthermore discloses an automatic microblog topic recommendation device.
Description
Technical field
The present invention relates to computer application technology and the field of social networks, and in particular to an automatic microblog topic recommendation method and device.
Background art
Text representation is a vital step in tasks such as web search, information filtering, and sentiment analysis. In traditional machine learning methods, text is usually represented in the form of features. The most commonly used feature representation in text research is the bag-of-words model. In the bag-of-words model, the most commonly used features are words, bigrams, n-grams, and some manually extracted template features. After the text has been represented as features, conventional models often use methods such as term frequency, mutual information, PLSA (Probabilistic Latent Semantic Analysis), and LDA (Latent Dirichlet Allocation) to select the most effective features. However, when representing text, traditional methods ignore contextual information and also lose word-order information.
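The loss of word order can be illustrated with a minimal sketch. It assumes whitespace-tokenized English text and unigram counts only; the function name `bag_of_words` and the two sample sentences are illustrative and do not come from the specification.

```python
from collections import Counter

def bag_of_words(text):
    """Return unigram counts for a whitespace-tokenized text (illustrative helper)."""
    return Counter(text.lower().split())

# Two sentences with opposite meanings but identical word counts.
first = "the dog bit the man"
second = "the man bit the dog"

# The bag-of-words representation cannot tell them apart.
same_bag = bag_of_words(first) == bag_of_words(second)
```

Because both sentences map to the same counts, any classifier built only on these features would treat them identically.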
In recent years, pre-trained word vectors and deep neural network models have brought new ideas to natural language processing. With the help of word vectors, some semantic composition methods have been proposed to represent the semantics of text. A recurrent neural network can build the semantics of a text in O(n) time. The model processes the whole document word by word and stores all of the preceding semantics in a hidden layer of fixed size. The advantage of a recurrent neural network is that it captures contextual information better and can model long-distance context. However, a recurrent neural network is a biased model: in a forward recurrent neural network, for example, words later in the text are more dominant than earlier words. Because of this semantic bias, a recurrent neural network includes more information from the latter part of the text when it builds the semantics of the whole text. In practice, however, not every text places its emphasis at the end, which may affect the accuracy of the semantic representation that is generated.
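The bias toward later words can be made concrete with a deliberately simplified sketch. It assumes a scalar, linear recurrence h_t = w * h_{t-1} + x_t with |w| < 1, which is a toy stand-in and not the network described in this specification; the function names are illustrative.

```python
def final_state(inputs, recurrent_weight):
    """Run the toy linear recurrence h_t = w * h_{t-1} + x_t and return the last state."""
    h = 0.0
    for x in inputs:
        h = recurrent_weight * h + x
    return h

def influence_per_position(n_words, recurrent_weight):
    """Contribution of a unit input at each position to the final state."""
    influences = []
    for position in range(n_words):
        # Feed a single 1.0 at `position` and zeros elsewhere.
        one_hot = [1.0 if t == position else 0.0 for t in range(n_words)]
        influences.append(final_state(one_hot, recurrent_weight))
    return influences

# With w = 0.5, each step back in the text halves the influence on the final state.
influences = influence_per_position(5, 0.5)
```

In this toy model the last position contributes most and the first position least, which mirrors the semantic bias described above.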
To solve the problem of semantic bias, it has been proposed to build text semantics with convolutional neural networks. Using max pooling, a convolutional neural network can pick out the most useful text fragments from a text, and its complexity is also O(n). Convolutional neural networks therefore have greater potential for building text semantics. However, existing convolutional neural network models always use fairly simple convolution kernels, such as a fixed window. With such models, how to determine the window size is a key issue. When the window is too small, too little contextual information may be retained, which makes it difficult to characterize words accurately; when the window is too large, there are too many parameters, which makes the model harder to optimize. It is therefore necessary to consider how to build a model that captures contextual information better and reduces the difficulty caused by choosing the window size.
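The growth in parameters with window size can be sketched with simple arithmetic. The sketch assumes that each filter spans a window of k word vectors of dimension d and has one bias term; the function name and the example sizes are illustrative, not values from the specification.

```python
def conv_parameter_count(window_size, embedding_dim, n_filters):
    """Parameters of one convolutional layer over text: weights plus one bias per filter."""
    weights_per_filter = window_size * embedding_dim
    return n_filters * (weights_per_filter + 1)

# Same embedding size and filter count, two different window sizes.
small_window = conv_parameter_count(window_size=3, embedding_dim=50, n_filters=100)
large_window = conv_parameter_count(window_size=7, embedding_dim=50, n_filters=100)
```

Under these assumptions the parameter count grows linearly with the window size, which is the trade-off the paragraph above describes.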
Research on the topics of short texts has always received much attention, and dividing text information into topics more accurately is a problem that needs to be solved.
Summary of the invention
The present invention aims to solve one of the above technical problems at least to some extent.
To this end, a first object of the present invention is to propose an automatic microblog topic recommendation method. The method can help users and the microblog platform manage massive microblog content.
A second object of the present invention is to propose an automatic microblog topic recommendation device.
To achieve the above objects, an automatic microblog topic recommendation method according to an embodiment of the first aspect of the present invention includes: processing the text content of a microblog based on a neural network model to obtain a feature vector; classifying the text content of the microblog by a softmax classifier according to the feature vector to obtain a topic category; and automatically recommending topics for microblogs that contain no topic according to the topic category.
In the automatic microblog topic recommendation method of the embodiment of the present invention, the text content of a microblog is first processed based on a neural network model to obtain a feature vector; the text content of the microblog is then classified by a softmax classifier according to the feature vector to obtain a topic category; finally, topics are automatically recommended for microblogs that contain no topic according to the topic category. The method can help users and the microblog platform manage massive microblog content.
In some examples, the neural network model includes a convolutional neural network model and a recurrent neural network model.
In some examples, processing the text content of the microblog based on the neural network model to obtain the feature vector specifically includes: removing useless information from the text content of the microblog, and removing useless stop words according to a stop-word list, to obtain new text content; performing a convolution operation on each sentence of the new text content to obtain the local feature of each basic unit in the sentence, and performing a max operation on the local features to obtain the feature vector of the sentence; and finally processing the feature vectors of the sentences with a recurrent neural network to obtain the feature vector of the text content of the microblog.
In some examples, the useless information includes @ information, URL information, and picture information.
To achieve the above objects, an automatic microblog topic recommendation device according to an embodiment of the second aspect of the present invention includes: a processing module, configured to process the text content of a microblog based on a neural network model to obtain a feature vector; a classification module, configured to classify the text content of the microblog by a softmax classifier according to the feature vector to obtain a topic category; and an automatic recommendation module, configured to automatically recommend topics for microblogs that contain no topic according to the topic category.
In the automatic microblog topic recommendation device of the embodiment of the present invention, the processing module first processes the text content of a microblog based on a neural network model to obtain a feature vector; the classification module then classifies the text content of the microblog by a softmax classifier according to the feature vector to obtain a topic category; finally, the automatic recommendation module automatically recommends topics for microblogs that contain no topic according to the topic category. The device can help users and the microblog platform manage massive microblog content.
In some examples, the neural network model includes a convolutional neural network model and a recurrent neural network model.
In some examples, the processing module is specifically configured to: remove useless information from the text content of the microblog, and remove useless stop words according to a stop-word list, to obtain new text content; perform a convolution operation on each sentence of the new text content to obtain the local feature of each basic unit in the sentence, and perform a max operation on the local features to obtain the feature vector of the sentence; and finally process the feature vectors of the sentences with a recurrent neural network to obtain the feature vector of the text content of the microblog.
In some examples, the useless information includes @ information, URL information, and picture information.
Additional aspects and advantages of the present invention will be given in part in the following description, will in part become apparent from the following description, or will be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a flow chart of an automatic microblog topic recommendation method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a convolutional layer of a convolutional neural network model according to an embodiment of the present invention;
Fig. 3 is a flow chart of a recurrent neural network model building text semantics according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the feature vector of the text content of a microblog according to an embodiment of the present invention;
Fig. 5 is a flow chart of an automatic microblog topic recommendation method according to a specific embodiment of the present invention; and
Fig. 6 is a schematic diagram of an automatic microblog topic recommendation device according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the accompanying drawings are exemplary; they are intended to explain the present invention and shall not be construed as limiting it.
Fig. 1 is a flow chart of an automatic microblog topic recommendation method according to an embodiment of the present invention.
As shown in Fig. 1, the automatic microblog topic recommendation method may include:
S101: processing the text content of a microblog based on a neural network model to obtain a feature vector.
It should be noted that, in some examples, the neural network model may include a convolutional neural network model and a recurrent neural network model.
It is understood that the high fault tolerance and strongly nonlinear descriptive power of neural networks have led to their wide study and application, and that convolutional neural networks are outstanding among connectionist models. A convolutional neural network is a multi-layer neural network in which each layer consists of several two-dimensional planes and each plane consists of several independent neurons. The network contains computation layers and feature extraction layers. Generally, the input of each neuron is connected to a local receptive field of the previous layer and extracts the feature of that local region; once the local feature has been extracted, its positional relationship to the other features is also determined. Each computation layer consists of several feature maps; each feature map is a plane, and the neurons in a plane have identical weights. Because the neurons on each map share weights, the number of free parameters of the network is reduced, which in turn reduces the complexity of selecting network parameters. The output connection values of the neurons in the network satisfy the "maximum detection hypothesis": among the set of neurons in a small region, only the neuron with the largest output has its output strengthened. Under this hypothesis, only one neuron is strengthened. A unit of the convolutional neural network is thus a maximum-output unit, which also controls the strengthening result of neighboring units. Apart from the input and output layers, a convolutional neural network also has convolutional layers, sampling layers, and fully connected layers. The convolutional layers and sampling layers contain several feature maps, each layer has several planes, and the neuron weights are revised continually during training. Neurons in the same plane have identical weights, which gives the same degree of displacement and rotation invariance. Because of weight sharing, the mapping from one plane to the next can be regarded as a convolution operation. From hidden layer to hidden layer, the spatial resolution decreases and the number of planes per layer increases, so that more feature information can be detected. In a convolutional layer, the feature maps of the previous layer are convolved with a learnable kernel, and the result of the convolution, after passing through an activation function, forms the neurons of this layer and thus its feature map. An example of a convolutional layer is shown in Fig. 2. A convolutional neural network achieves invariance to displacement, scaling, and distortion by three means: local receptive fields, weight sharing, and sub-sampling. A local receptive field means that each neuron in a layer is connected only to the neurons within a small neighborhood of the previous layer; through local receptive fields, each neuron can extract elementary features. Weight sharing gives the convolutional neural network fewer parameters, so that it needs relatively little training data and training time. Sub-sampling reduces the resolution of the features and achieves invariance to displacement, scaling, and other forms of distortion.
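The convolutional layer and sub-sampling step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions: a single shared kernel, a tanh activation, and 2x2 max sub-sampling are choices made here for the example and are not specified in the text; `conv2d_valid` and `max_subsample` are illustrative names.

```python
import math

def conv2d_valid(image, kernel, bias=0.0):
    """Slide one shared kernel over the image (no padding) and apply tanh.

    Like most deep learning code, this computes cross-correlation (no kernel flip).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            # Each output neuron sees only a small local receptive field.
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(math.tanh(total))
        feature_map.append(row)
    return feature_map

def max_subsample(feature_map, size=2):
    """Reduce resolution by keeping the maximum of each non-overlapping size x size block."""
    return [
        [
            max(feature_map[i + di][j + dj] for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

# A 5x5 input with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]          # responds to left-to-right increases

feature_map = conv2d_valid(image, kernel)   # 4x4 map, same weights at every position
pooled = max_subsample(feature_map)         # 2x2 map after sub-sampling
```

The same four kernel weights are reused at every position, which is the weight sharing the paragraph describes, and the sub-sampled map still records the edge even though its exact column is no longer resolved.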
In addition, a recurrent neural network can build the semantics of a text in O(n) time. The model processes the whole document word by word and stores all of the preceding semantics in a hidden layer of fixed size. The advantage of a recurrent neural network is that it captures contextual information better and can model long-distance context. However, a recurrent neural network is a biased model: in a forward recurrent neural network, for example, words later in the text are more dominant than earlier words. Because of this semantic bias, a recurrent neural network includes more information from the latter part of the text when it builds the semantics of the whole text. When obtaining semantic representations of sentences and documents, it is natural to think of modeling them directly with the distributional hypothesis used for words. However, if the distributional hypothesis is used to generate vector representations of sentences or documents directly, a severe data sparsity problem arises. If a sentence is treated as a whole and its representation is trained with a word vector model, the training result has no statistical significance, because the vast majority of sentences occur only once. Moreover, the distributional hypothesis is a hypothesis about word meaning, and whether obtaining semantics from context in this way is also effective for sentences and documents remains open to discussion. A new approach is therefore needed for modeling sentences and documents, and the recurrent neural network is one such model. The recurrent neural network was first proposed by Elman et al. in 1990. The core of the model is to input each word of the text one by one in a loop while maintaining a hidden layer that retains all of the preceding information. The recurrent neural network is a special case of the recursive neural network: it can be regarded as corresponding to a tree in which the right subtree of every non-leaf node is a leaf node. This special structure gives the recurrent neural network two characteristics. First, because the network structure is fixed, the model needs only O(n) time to build the semantics of a text, which allows a recurrent neural network to model text semantics efficiently. Second, in terms of network structure, a recurrent neural network is very deep: the network has as many layers as the sentence has words. Consequently, training a recurrent neural network with traditional methods runs into vanishing or exploding gradients, and the model needs a dedicated method to carry out optimization. The process by which a recurrent neural network builds text semantics is shown in Fig. 3: each word is combined with the hidden layer representing all preceding context to form a new hidden layer, and the computation loops from the first word of the text to the last. After all words have been input to the model, the hidden layer corresponding to the last word represents the semantics of the whole text. In terms of optimization, recurrent neural networks also differ slightly from other network structures. In an ordinary neural network, back-propagation can be computed directly with the chain rule of derivatives. In a recurrent neural network, however, the weight matrix H from one hidden layer to the next is reused, so differentiating directly with respect to the weight matrix is very difficult. The simplest way to optimize a recurrent neural network is back-propagation through time (BPTT). The method first unrolls the network; for each labeled sample, the model updates the hidden layers one by one with ordinary back-propagation and repeatedly updates the weight matrix H. Because of gradient decay, only a fixed number of layers is propagated when a recurrent neural network is optimized with BPTT. To solve the gradient decay problem, Hochreiter and Schmidhuber proposed the LSTM model in 1997. The model introduces memory cells that can preserve long-distance information, and it is a commonly used optimization scheme for recurrent neural networks.
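The building process of Fig. 3 can be sketched as a plain Elman-style forward pass. This is a minimal illustration, assuming a tanh activation, a zero initial hidden layer, and hand-picked weights; the input matrix `W`, the bias `b`, and the function names are assumptions for the example, while `H` follows the text's name for the reused hidden-to-hidden weight matrix. Training (BPTT) and LSTM cells are not shown.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def rnn_encode(word_vectors, W, H, b):
    """Combine each word with the previous hidden layer; return the last hidden layer."""
    hidden = [0.0] * len(b)
    for x in word_vectors:
        from_input = matvec(W, x)        # contribution of the current word
        from_context = matvec(H, hidden)  # contribution of all preceding words
        hidden = [math.tanh(p + q + c) for p, q, c in zip(from_input, from_context, b)]
    return hidden

# Toy 2-dimensional word vectors and weights (illustrative values only).
W = [[1.0, 0.0], [0.0, 1.0]]
H = [[0.5, 0.0], [0.0, 0.5]]
b = [0.0, 0.0]
word_a, word_b = [1.0, 0.0], [0.0, 1.0]

forward = rnn_encode([word_a, word_b], W, H, b)
backward = rnn_encode([word_b, word_a], W, H, b)
```

The same matrix `H` is applied at every step, and the two encodings differ because the later word dominates the final hidden layer, which matches the bias discussed above.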
Specifically, in some examples, processing the text content of the microblog based on the neural network model to obtain the feature vector includes: removing useless information from the text content of the microblog, and removing useless stop words according to a stop-word list, to obtain new text content. A convolution operation is performed on each sentence of the new text content to obtain the local feature of each basic unit in the sentence, and a max operation is performed on the local features to obtain the feature vector of the sentence. Finally, a recurrent neural network processes the feature vectors of the sentences to obtain the feature vector of the text content of the microblog. The feature vector of the text content of the microblog is shown in Fig. 4.
More specifically, in some examples, the useless information includes @ information, URL information, and picture information. It is understood that useless information such as @ information, URL information, and picture information is removed from the microblog text, the Chinese microblog content is then segmented into words, and useless stop words are removed according to a Chinese stop-word list.
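The cleaning step can be sketched with regular expressions. This is a minimal sketch under several assumptions: the exact patterns for @ information, URLs, and picture information are not given in the text, so the three patterns below (in particular the bracketed picture placeholder) are hypothetical; Chinese word segmentation needs a dedicated segmenter, so the sketch uses whitespace splitting as a stand-in; `clean_microblog` is an illustrative name.

```python
import re

# Hypothetical patterns; the specification does not define the exact formats.
URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@[\w\-]+")
PICTURE = re.compile(r"\[(?:pic|图片)\]")

def clean_microblog(text, stop_words):
    """Strip useless information, tokenize, and drop stop words."""
    # URLs are removed first so that an '@' inside a URL is not treated as a mention.
    for pattern in (URL, MENTION, PICTURE):
        text = pattern.sub(" ", text)
    tokens = text.split()  # stand-in for Chinese word segmentation
    return [token for token in tokens if token not in stop_words]

stop_words = {"the", "was", "this"}
raw = "@alice check this out http://example.com/abc [pic] the match was great"
tokens = clean_microblog(raw, stop_words)
```

Only the content words remain after cleaning, and these are what the later convolution step would consume.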
It should be noted that the useful information in microblog text data is the microblog content itself.
A convolution operation is performed on each sentence of the new text content to obtain the local feature of each basic unit in the sentence, and a max operation is performed on the local features to obtain the feature vector of the sentence. It is understood that this learns a sentence-level vector representation of the microblog content. Consider a sentence x containing N basic units (r1, r2, ..., rN): for a character-level sentence the basic unit is a single character, and for a word-level sentence the basic unit is a word obtained by segmentation. Two main problems arise when computing sentence-level features: different sentences have different lengths, and important information may appear at any position in the sentence. Modeling the sentence with a convolutional layer to compute the sentence-level features solves both problems. The convolution operation yields the local feature of each basic unit in the sentence, and a max operation over these local features then yields a sentence feature vector of fixed length. For a sentence x containing N basic units (r1, r2, ..., rN), the convolutional layer applies a matrix-vector operation to each contiguous window of size k. Different sizes k of the convolution window capture different local information. A suitable value of k is set through preliminary experiments. The sentence feature vectors generated by all convolutional layers are concatenated to obtain a new sentence feature vector.
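The window convolution, max operation, and concatenation described above can be sketched as follows. The sketch assumes one weight matrix per window size k with no bias or activation, and it requires the sentence to have at least as many units as the largest window; the names `window_features`, `sentence_vector`, and `filter_banks` are illustrative and not taken from the specification.

```python
def window_features(units, weights, k):
    """Apply a matrix-vector product to every contiguous window of k units."""
    features = []
    for start in range(len(units) - k + 1):
        # Flatten the k unit vectors of this window into one vector.
        window = [value for unit in units[start:start + k] for value in unit]
        features.append([sum(w * x for w, x in zip(row, window)) for row in weights])
    return features

def sentence_vector(units, filter_banks):
    """Max over all windows per feature, then concatenate across window sizes."""
    if len(units) < max(filter_banks):
        raise ValueError("sentence is shorter than the largest window")
    vector = []
    for k, weights in sorted(filter_banks.items()):
        features = window_features(units, weights, k)
        vector.extend(max(column) for column in zip(*features))
    return vector

# Unit vectors of dimension 2; one bank for k=1 (two features), one for k=2 (one feature).
filter_banks = {
    1: [[1, 0], [0, 1]],
    2: [[1, 0, 0, 1]],
}
short_sentence = [[1, 0], [0, 2], [3, 1]]
long_sentence = [[1, 0], [0, 2], [3, 1], [0, 0], [5, 5]]

short_vec = sentence_vector(short_sentence, filter_banks)
long_vec = sentence_vector(long_sentence, filter_banks)
```

Both sentences yield a vector of the same length regardless of how many units they contain, and each feature keeps its strongest response wherever it occurs, which addresses the two problems named above.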
S102: classifying the text content of the microblog by a softmax classifier according to the feature vector to obtain a topic category.
It should be noted that the classifier may be, but is not limited to, a softmax classifier.
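A softmax classifier over a feature vector can be sketched as below. This is a minimal illustration assuming a single linear layer followed by softmax; the weights, biases, and topic labels are made-up values for the example, and training is not shown.

```python
import math

def softmax(scores):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature, weights, biases, labels):
    """Return the most probable topic label and the full probability list."""
    scores = [
        sum(w * x for w, x in zip(row, feature)) + bias
        for row, bias in zip(weights, biases)
    ]
    probabilities = softmax(scores)
    best = max(range(len(labels)), key=probabilities.__getitem__)
    return labels[best], probabilities

# Illustrative parameters for two topic categories.
labels = ["sports", "music"]
weights = [[2.0, 0.0], [0.0, 2.0]]
biases = [0.0, 0.0]

topic, probabilities = classify([1.0, 0.0], weights, biases, labels)
```

The probabilities sum to one, so the same output can also be used to rank several candidate topics instead of returning only the top one.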
S103: automatically recommending topics for microblogs that contain no topic according to the topic category.
In the automatic microblog topic recommendation method of the embodiment of the present invention, the text content of a microblog is first processed based on a neural network model to obtain a feature vector; the text content of the microblog is then classified by a softmax classifier according to the feature vector to obtain a topic category; finally, topics are automatically recommended for microblogs that contain no topic according to the topic category. The method can help users and the microblog platform manage massive microblog content.
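The final step, recommending topics only for microblogs that have none, can be sketched as a filter around a classifier. The sketch assumes that an existing topic is marked as `#topic#` in the text, which is an assumption about the platform's format and is not stated in this specification; `predict_topic` stands in for the trained model, and all names are illustrative.

```python
import re

# Assumed topic marker format: text enclosed by a pair of '#' characters.
TOPIC_TAG = re.compile(r"#[^#\s]+#")

def recommend_topics(posts, predict_topic):
    """Return {post_id: recommended_topic} for posts that carry no topic tag."""
    recommendations = {}
    for post_id, text in posts.items():
        if TOPIC_TAG.search(text):
            continue  # the author already chose a topic
        recommendations[post_id] = predict_topic(text)
    return recommendations

def toy_predictor(text):
    """Placeholder for the feature extraction and softmax steps."""
    return "sports" if "match" in text else "other"

posts = {
    1: "#football# great match",
    2: "great match tonight",
    3: "rainy day",
}
recommendations = recommend_topics(posts, toy_predictor)
```

Posts that already carry a topic are skipped, so the classifier is only consulted where a recommendation is actually needed.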
For example, as shown in Fig. 5, an automatic topic recommendation system for Sina Weibo has been developed on the basis of the recurrent-neural-network-based automatic microblog topic recommendation method of the present invention. The system's recommendation for new microblog content posted by a user comprises two stages. The first is the automatic preprocessing stage, in which the original microblog content is cleaned and the convolutional neural network and recurrent neural network are then used to obtain a microblog-level vector representation. The second is the topic recommendation stage, in which the system calls the trained softmax classification model, uses the microblog vector representation as the feature to classify the topic, and recommends the topic category to the user. The recommendation results of the system can help users and the microblog platform manage massive microblog data effectively.
To make the automatic microblog topic recommendation method clearer to those skilled in the art, it is explained below with reference to Fig. 6: given a new microblog, a convolutional neural network is first used to obtain sentence-level vector representations; a recurrent neural network is then used to learn the microblog-level vector representation; the trained model is then used to classify the topic, and the topic category is automatically recommended to the user.
In the automatic microblog topic recommendation method of the embodiment of the present invention, the text content of a microblog is first processed based on a neural network model to obtain a feature vector; the text content of the microblog is then classified by a softmax classifier according to the feature vector to obtain a topic category; finally, topics are automatically recommended for microblogs that contain no topic according to the topic category. The method can help users and the microblog platform manage massive microblog content.
Corresponding to the automatic microblog topic recommendation method provided by the above embodiments, an embodiment of the present invention also provides an automatic microblog topic recommendation device. Because the device provided by the embodiment of the present invention has the same or similar technical features as the method provided by the above embodiments, the foregoing embodiments of the method also apply to the device provided by this embodiment and are not described in detail again here. As shown in Fig. 6, the automatic microblog topic recommendation device may include a processing module 110, a classification module 120, and an automatic recommendation module 130.
The processing module 110 is configured to process the text content of a microblog based on a neural network model to obtain a feature vector.
The classification module 120 is configured to classify the text content of the microblog by a softmax classifier according to the feature vector to obtain a topic category.
The automatic recommendation module 130 is configured to automatically recommend topics for microblogs that contain no topic according to the topic category.
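The three-module structure can be sketched as a small container that wires the modules together. This is only an illustration: the class name, the field names, and the injected `has_topic` check are assumptions made for the example, and the lambdas stand in for the trained neural network and classifier.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

@dataclass
class TopicRecommendationDevice:
    """Illustrative wiring of processing, classification, and recommendation."""
    processing_module: Callable[[str], List[float]]           # role of module 110
    classification_module: Callable[[Sequence[float]], str]   # role of module 120
    has_topic: Callable[[str], bool]                          # assumed helper

    def recommend(self, text: str) -> Optional[str]:
        """Role of module 130: recommend only when the microblog has no topic."""
        if self.has_topic(text):
            return None
        feature_vector = self.processing_module(text)
        return self.classification_module(feature_vector)

# Stand-in modules; a real system would plug in the trained models here.
device = TopicRecommendationDevice(
    processing_module=lambda text: [float(len(text))],
    classification_module=lambda vector: "long" if vector[0] > 10 else "short",
    has_topic=lambda text: "#" in text,
)
result = device.recommend("a much longer post")
```

Keeping the modules as injected callables means each one can be replaced or tested on its own, which matches the modular description of the device.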
In some examples, the neural network model includes a convolutional neural network model and a recurrent neural network model.
In some examples, the processing module 110 is specifically configured to: remove useless information from the text content of the microblog, and remove useless stop words according to a stop-word list, to obtain new text content; perform a convolution operation on each sentence of the new text content to obtain the local feature of each basic unit in the sentence, and perform a max operation on the local features to obtain the feature vector of the sentence; and finally process the feature vectors of the sentences with a recurrent neural network to obtain the feature vector of the text content of the microblog.
In some examples, the useless information includes @ information, URL information, and picture information.
In the automatic microblog topic recommendation device of the embodiment of the present invention, the processing module first processes the text content of a microblog based on a neural network model to obtain a feature vector; the classification module then classifies the text content of the microblog by a softmax classifier according to the feature vector to obtain a topic category; finally, the automatic recommendation module automatically recommends topics for microblogs that contain no topic according to the topic category. The device can help users and the microblog platform manage massive microblog content.
In the description of the present invention, it should be understood that the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless specifically limited otherwise.
In this specification, a description referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided that they do not conflict, those skilled in the art may join and combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
Any process or method description in a flow chart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
Those of ordinary skill in the art will appreciate that all or part of the steps carried by the method of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the invention. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the invention.
Claims (8)
1. An automatic microblog topic recommendation method, characterized by comprising:
processing the text content of a microblog based on a neural network model to obtain a feature vector;
classifying the text content of the microblog with a softmax classifier according to the feature vector to obtain a topic category; and
automatically recommending a topic for a microblog that contains no topic, according to the topic category.
2. The automatic microblog topic recommendation method of claim 1, characterized in that the neural network model comprises:
a convolutional neural network model and a recurrent neural network model.
3. The automatic microblog topic recommendation method of claim 1, characterized in that processing the text content of the microblog based on the neural network model to obtain the feature vector specifically comprises:
removing junk information from the text content of the microblog, and removing useless stop words according to a stop-word list, to obtain new text content;
performing a convolution operation on each sentence of the new text content to obtain local features of each basic unit in the sentence, and performing a maximum operation on the local features to obtain a feature vector of the sentence; and
finally, processing the feature vectors of the sentences with a recurrent neural network to obtain the feature vector of the text content of the microblog.
4. The automatic microblog topic recommendation method of claim 3, characterized in that the junk information comprises:
@ information, URL information, and picture information.
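Illustrative note (not part of the claims): the processing steps of claims 3 and 4 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The regular expressions for junk information, the stop-word list, the toy embedding, the layer sizes, the random untrained weights, and the plain tanh recurrent cell are all hypothetical choices that the claims do not specify. The sketch also assumes the text is already segmented into whitespace-separated words.

```python
import math
import random
import re

# Hypothetical patterns for the junk information listed in claim 4.
JUNK = [
    re.compile(r"https?://\S+"),    # URL information
    re.compile(r"@\S+"),            # @ information (user mentions)
    re.compile(r"\[[^\[\]]*\]"),    # picture placeholders such as "[image]"
]
STOP_WORDS = {"the", "a", "an", "is", "of"}   # stand-in for a real stop-word list
DIM, FILTERS, WIDTH, HIDDEN = 4, 3, 2, 3      # toy sizes, chosen arbitrarily


def strip_junk(text):
    """Replace every junk span with a space."""
    for pattern in JUNK:
        text = pattern.sub(" ", text)
    return text


def tokens(sentence):
    """Lower-case, split on whitespace, and drop stop words."""
    return [t for t in sentence.lower().split() if t not in STOP_WORDS]


def embed(token):
    """Toy deterministic embedding; a real system would use trained word vectors."""
    code = sum(map(ord, token))
    return [math.sin((i + 1) * code) for i in range(DIM)]


def conv_max(vectors, filters):
    """Convolve each filter over windows of WIDTH tokens, then keep the max per filter."""
    if not vectors:
        return [0.0] * len(filters)
    # Pad sentences shorter than the window with zero vectors.
    padded = vectors + [[0.0] * DIM] * max(0, WIDTH - len(vectors))
    windows = [sum(padded[i:i + WIDTH], []) for i in range(len(padded) - WIDTH + 1)]
    local = [[math.tanh(sum(w * x for w, x in zip(f, win))) for f in filters]
             for win in windows]
    return [max(column) for column in zip(*local)]


def rnn(sentence_vectors, w_in, w_rec):
    """Plain tanh recurrent cell; the last hidden state is the microblog feature vector."""
    h = [0.0] * len(w_rec)
    for x in sentence_vectors:
        h = [math.tanh(sum(a * b for a, b in zip(w_in[j], x))
                       + sum(a * b for a, b in zip(w_rec[j], h)))
             for j in range(len(h))]
    return h


def microblog_features(text, seed=0):
    """Clean the text, build one vector per sentence, and combine them with the RNN."""
    rng = random.Random(seed)   # untrained random weights, for illustration only

    def matrix(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    filters = matrix(FILTERS, WIDTH * DIM)
    w_in, w_rec = matrix(HIDDEN, FILTERS), matrix(HIDDEN, HIDDEN)
    sentences = [tokens(s) for s in re.split(r"[.!?。！？]+", strip_junk(text))]
    sentence_vectors = [conv_max([embed(t) for t in s], filters) for s in sentences if s]
    return rnn(sentence_vectors, w_in, w_rec)
```

Junk is stripped before sentence splitting so that the periods inside a URL are not mistaken for sentence boundaries. In a trained system the filters and recurrent weights would be learned jointly with the classifier rather than drawn at random.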
5. An automatic microblog topic recommendation device, characterized by comprising:
a processing module, configured to process the text content of a microblog based on a neural network model to obtain a feature vector;
a classification module, configured to classify the text content of the microblog with a softmax classifier according to the feature vector to obtain a topic category; and
an automatic recommendation module, configured to automatically recommend a topic for a microblog that contains no topic, according to the topic category.
6. The automatic microblog topic recommendation device of claim 5, characterized in that the neural network model comprises:
a convolutional neural network model and a recurrent neural network model.
7. The automatic microblog topic recommendation device of claim 5, characterized in that the processing module is specifically configured to:
remove junk information from the text content of the microblog, and remove useless stop words according to a stop-word list, to obtain new text content;
perform a convolution operation on each sentence of the new text content to obtain local features of each basic unit in the sentence, and perform a maximum operation on the local features to obtain a feature vector of the sentence; and
finally, process the feature vectors of the sentences with a recurrent neural network to obtain the feature vector of the text content of the microblog.
8. The automatic microblog topic recommendation device of claim 7, characterized in that the junk information comprises:
@ information, URL information, and picture information.
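Illustrative note (not part of the claims): the classification and recommendation steps of claims 1 and 5 can be sketched as below. This is a hedged sketch only. The topic labels, the weights and bias, the keyword-counting `toy_extract` stand-in for the processing module, and the use of a `#topic#` marker to decide whether a microblog already has a topic are all assumptions made for illustration; the claims name only a softmax classifier and a recommendation for microblogs without a topic.

```python
import math
import re

TOPICS = ["sports", "finance", "entertainment"]   # hypothetical topic categories
HAS_TOPIC = re.compile(r"#[^#]+#")                # assumed "#topic#" marker


def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, bias):
    """Classification module: a linear layer followed by softmax."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return TOPICS[best], probs


def recommend(text, extract, weights, bias):
    """Recommendation module: only microblogs without a topic get a suggestion."""
    if HAS_TOPIC.search(text):
        return None
    topic, _ = classify(extract(text), weights, bias)
    return f"#{topic}#"


def toy_extract(text):
    """Stand-in for the processing module: counts a few hand-picked keywords."""
    words = text.lower().split()
    return [float(words.count(k)) for k in ("match", "stock", "movie")]


# Hand-set parameters for the toy example; a real classifier would learn these.
WEIGHTS = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
BIAS = [0.0, 0.0, 0.0]
```

With these toy parameters, `recommend("great match tonight", toy_extract, WEIGHTS, BIAS)` returns `"#sports#"`, while a text that already contains `#NBA#` returns `None`. Passing the extractor as an argument keeps the three modules separable, which mirrors the module split in claim 5.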
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610268830.1A CN105975497A (en) | 2016-04-27 | 2016-04-27 | Automatic microblog topic recommendation method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610268830.1A CN105975497A (en) | 2016-04-27 | 2016-04-27 | Automatic microblog topic recommendation method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105975497A (en) | 2016-09-28 |
Family
ID=56993169
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610268830.1A Pending CN105975497A (en) | 2016-04-27 | 2016-04-27 | Automatic microblog topic recommendation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105975497A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106844765A (en) * | 2017-02-22 | 2017-06-13 | 中国科学院自动化研究所 | Significant information detection method and device based on convolutional neural network |
CN106874410A (en) * | 2017-01-22 | 2017-06-20 | 清华大学 | Chinese microblog text emotion classification method and system based on convolutional neural networks |
CN107273348A (en) * | 2017-05-02 | 2017-10-20 | 深圳大学 | Topic and emotion combined detection method and device for text |
CN107832047A (en) * | 2017-11-27 | 2018-03-23 | 北京理工大学 | LSTM-based non-API function argument recommendation method |
CN108021934A (en) * | 2017-11-23 | 2018-05-11 | 阿里巴巴集团控股有限公司 | Method and device for recognizing multiple elements |
CN108038414A (en) * | 2017-11-02 | 2018-05-15 | 平安科技(深圳)有限公司 | Character personality analysis method, device and storage medium based on recurrent neural network |
CN108694202A (en) * | 2017-04-10 | 2018-10-23 | 上海交通大学 | Configurable spam filtering system and filtering method based on classification algorithm |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101887443A (en) * | 2009-05-13 | 2010-11-17 | 华为技术有限公司 | Method and device for classifying texts |
CN102082619A (en) * | 2010-12-27 | 2011-06-01 | 中国人民解放军理工大学通信工程学院 | Transmission adaptive method based on double credible evaluations |
CN102402621A (en) * | 2011-12-27 | 2012-04-04 | 浙江大学 | Image retrieval method based on image classification |
CN103617230A (en) * | 2013-11-26 | 2014-03-05 | 中国科学院深圳先进技术研究院 | Microblog-based advertisement recommendation method and system |
CN104572892A (en) * | 2014-12-24 | 2015-04-29 | 中国科学院自动化研究所 | Text classification method based on recurrent convolutional network |
CN104834747A (en) * | 2015-05-25 | 2015-08-12 | 中国科学院自动化研究所 | Short text classification method based on convolutional neural network |
CN104899298A (en) * | 2015-06-09 | 2015-09-09 | 华东师范大学 | Microblog sentiment analysis method based on large-scale corpus characteristic learning |
CN105447179A (en) * | 2015-12-14 | 2016-03-30 | 清华大学 | Microblog social network based topic automated recommendation method and system |
Non-Patent Citations (3)
Title |
---|
PUYANG XU,RUHI SARIKAYA: "Contextual domain classification in spoken language understanding systems using recurrent neural network", 《 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)》 * |
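刘龙飞,杨亮: "Sentiment orientation analysis of microblogs based on convolutional neural networks", 《Journal of Chinese Information Processing》 * |
张剑,屈丹: "Recurrent neural network language model based on word vector features", 《Pattern Recognition and Artificial Intelligence》 * |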
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106874410A (en) * | 2017-01-22 | 2017-06-20 | 清华大学 | Chinese microblog text emotion classification method and system based on convolutional neural networks |
CN106844765A (en) * | 2017-02-22 | 2017-06-13 | 中国科学院自动化研究所 | Significant information detection method and device based on convolutional neural network |
CN106844765B (en) * | 2017-02-22 | 2019-12-20 | 中国科学院自动化研究所 | Significant information detection method and device based on convolutional neural network |
CN108694202A (en) * | 2017-04-10 | 2018-10-23 | 上海交通大学 | Configurable spam filtering system and filtering method based on classification algorithm |
CN107273348A (en) * | 2017-05-02 | 2017-10-20 | 深圳大学 | Topic and emotion combined detection method and device for text |
CN107273348B (en) * | 2017-05-02 | 2020-12-18 | 深圳大学 | Topic and emotion combined detection method and device for text |
CN108038414A (en) * | 2017-11-02 | 2018-05-15 | 平安科技(深圳)有限公司 | Character personality analysis method, device and storage medium based on recurrent neural network |
WO2019085329A1 (en) * | 2017-11-02 | 2019-05-09 | 平安科技(深圳)有限公司 | Recurrent neural network-based personal character analysis method, device, and storage medium |
CN108021934A (en) * | 2017-11-23 | 2018-05-11 | 阿里巴巴集团控股有限公司 | Method and device for recognizing multiple elements |
CN108021934B (en) * | 2017-11-23 | 2022-03-04 | 创新先进技术有限公司 | Method and device for recognizing multiple elements |
CN107832047A (en) * | 2017-11-27 | 2018-03-23 | 北京理工大学 | LSTM-based non-API function argument recommendation method |
CN107832047B (en) * | 2017-11-27 | 2018-11-27 | 北京理工大学 | LSTM-based non-API function argument recommendation method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109376242B (en) | Text classification method based on cyclic neural network variant and convolutional neural network | |
CN110222178B (en) | Text emotion classification method and device, electronic equipment and readable storage medium | |
CN108763326B (en) | Emotion analysis model construction method of convolutional neural network based on feature diversification | |
Cao et al. | A joint model for word embedding and word morphology | |
CN105975497A (en) | Automatic microblog topic recommendation method and device | |
CN107092596A (en) | Text emotion analysis method based on attention CNNs and CCR | |
Alwehaibi et al. | Comparison of pre-trained word vectors for arabic text classification using deep learning approach | |
Fahad et al. | Inflectional review of deep learning on natural language processing | |
CN106886580B (en) | Image emotion polarity analysis method based on deep learning | |
CN110110323B (en) | Text emotion classification method and device and computer readable storage medium | |
CN110083700A (en) | A kind of enterprise's public sentiment sensibility classification method and system based on convolutional neural networks | |
Wahid et al. | Cricket sentiment analysis from Bangla text using recurrent neural network with long short term memory model | |
CN110188195B (en) | Text intention recognition method, device and equipment based on deep learning | |
KR20190063978A (en) | Automatic classification method of unstructured data | |
CN107066445A (en) | The deep learning method of one attribute emotion word vector | |
CN107688576B (en) | Construction and tendency classification method of CNN-SVM model | |
CN107679110A (en) | The method and device of knowledge mapping is improved with reference to text classification and picture attribute extraction | |
Al Wazrah et al. | Sentiment analysis using stacked gated recurrent unit for arabic tweets | |
CN106919557A (en) | A kind of document vector generation method of combination topic model | |
Pan et al. | Deep neural network-based classification model for Sentiment Analysis | |
CN111339772B (en) | Russian text emotion analysis method, electronic device and storage medium | |
CN106446147A (en) | Emotion analysis method based on structuring features | |
Khatun et al. | Authorship Attribution in Bangla literature using Character-level CNN | |
CN108733675A (en) | Affective Evaluation method and device based on great amount of samples data | |
CN110321918A (en) | The method of public opinion robot system sentiment analysis and image labeling based on microblogging |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160928 |
RJ01 | Rejection of invention patent application after publication |