CN113298365B - Cultural additional value assessment method based on LSTM - Google Patents
Cultural additional value assessment method based on LSTM
- Publication number
- CN113298365B CN113298365B CN202110515653.3A CN202110515653A CN113298365B CN 113298365 B CN113298365 B CN 113298365B CN 202110515653 A CN202110515653 A CN 202110515653A CN 113298365 B CN113298365 B CN 113298365B
- Authority
- CN
- China
- Prior art keywords
- feature
- cultural
- word
- emotion
- sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 23
- 230000008451 emotion Effects 0.000 claims abstract description 68
- 238000012549 training Methods 0.000 claims abstract description 17
- 238000012360 testing method Methods 0.000 claims abstract description 16
- 238000011156 evaluation Methods 0.000 claims abstract description 13
- 238000004364 calculation method Methods 0.000 claims abstract description 9
- 239000003607 modifier Substances 0.000 claims description 26
- 230000006870 function Effects 0.000 claims description 19
- 230000011218 segmentation Effects 0.000 claims description 19
- 238000004458 analytical method Methods 0.000 claims description 9
- 230000014759 maintenance of location Effects 0.000 claims description 8
- 210000002569 neuron Anatomy 0.000 claims description 6
- 238000012216 screening Methods 0.000 claims description 6
- 238000007493 shaping process Methods 0.000 claims description 6
- 230000003340 mental effect Effects 0.000 claims description 5
- 230000008569 process Effects 0.000 claims description 5
- 238000012545 processing Methods 0.000 claims description 5
- 230000004913 activation Effects 0.000 claims description 3
- 238000002372 labelling Methods 0.000 claims description 3
- 230000007935 neutral effect Effects 0.000 claims description 3
- 230000005540 biological transmission Effects 0.000 claims 2
- 238000013210 evaluation model Methods 0.000 abstract description 5
- 238000011160 research Methods 0.000 abstract description 5
- 238000011002 quantification Methods 0.000 abstract description 3
- 230000007547 defect Effects 0.000 abstract description 2
- 238000013528 artificial neural network Methods 0.000 description 6
- 238000003058 natural language processing Methods 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 4
- 238000013135 deep learning Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 230000004927 fusion Effects 0.000 description 3
- 230000006872 improvement Effects 0.000 description 2
- 230000008447 perception Effects 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- 206010063385 Intellectualisation Diseases 0.000 description 1
- 239000002390 adhesive tape Substances 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000007787 long-term memory Effects 0.000 description 1
- 230000015654 memory Effects 0.000 description 1
- 238000005065 mining Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 238000004451 qualitative analysis Methods 0.000 description 1
- 230000031068 symbiosis, encompassing mutualism through parasitism Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Economics (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Game Theory and Decision Science (AREA)
- Probability & Statistics with Applications (AREA)
- Tourism & Hospitality (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application belongs to the technical field of cultural additional value assessment and relates to a cultural additional value assessment method based on LSTM, which comprises the following steps: step 1: constructing a three-dimensional index system based on person-enterprise-society; step 2: establishing a feature word list representing the comment corpus of the cultural products to be evaluated; step 3: extracting feature sentences to obtain feature sentence data; step 4: training an LSTM network model; step 5: performing accuracy testing and prediction with the LSTM network model to obtain emotion values; step 6: weighting the indexes of the three-dimensional index system of step 1; step 7: establishing a cultural additional value calculation equation model to obtain the cultural additional evaluation value. The method remedies shortcomings of traditional evaluation models, such as overly subjective and hard-to-quantify evaluation indexes, and is well suited to the large-scale comment data found on network platforms.
Description
Technical Field
The application belongs to the technical field of cultural additional value assessment, relates to a cultural additional value assessment method based on LSTM (long short-term memory artificial neural network), and particularly relates to a cultural additional value assessment method based on an LSTM neural network.
Background
The rapid development of internet technology has driven the trend toward a digital economy. In this new era, the cultural industry is gradually moving toward digitization and intelligence, bringing people new cultural experiences. The organic fusion of digital technology, culture, and platforms has given rise to a series of innovative forms and new business models, so that cultural and creative products are no longer simple reproductions of traditional culture; instead, digital technology fuses products with different cultures, bringing hollow and rigid cultural symbols "to life" and giving products greater cultural added value. For example, museum-based cultural and creative brands have created countless "internet-celebrity products" that are deeply loved by the public: the Palace Museum's cultural and creative products have attracted countless fans, and its decorative tapes are snapped up online in batches. High cultural added value lets a cultural product satisfy consumers' spiritual and cultural needs, becomes an important means for merchants to win consumer favor and build a distinctive cultural brand image, and brings the excellent culture embedded in the product into ordinary people's lives, making the product a cultural carrier and ambassador.
Therefore, improving cultural added value is a main trend in the development of the cultural industry, and it has triggered a new round of thinking among cultural enterprises and academia: how much does cultural added value improve the original product, how can the fusion of different cultural elements with products raise cultural added value, and how can the rules behind cultural added value guide the design of cultural products and the shaping of cultural brands? Resolving these key questions first requires answering "what constitutes cultural added value" and "how to measure cultural added value". However, research on these two basic problems is still mainly qualitative, and quantitative methods for cultural added value remain largely unexplored. In view of this, the application analyzes the connotation and structure of cultural added value from the perspective of emotion; supported by product comment data on network platforms, it proposes a cultural additional value assessment method based on LSTM fine-grained emotion analysis and provides a reference for subsequent research.
Disclosure of Invention
The application aims at: a cultural added value evaluation method based on an LSTM neural network is provided, and an index system of the cultural added value and an LSTM emotion analysis evaluation model are constructed to solve the problems in the background technology.
The application is realized by the following technical scheme:
a cultural additional value assessment method based on LSTM comprises the following steps:
step 1: constructing a three-dimensional index system based on person-enterprise-society from the hierarchical function perspective of cultural additional value;
step 2: preparing a comment corpus of cultural products to be evaluated, performing word segmentation on the comment corpus, and then establishing a feature word list representing the comment corpus of cultural products to be evaluated based on a TF-IDF algorithm;
step 3: extracting a characteristic sentence to obtain characteristic sentence data;
step 4: training an LSTM network model by utilizing the feature sentence data extracted in the step 3, selecting cross entropy as a loss function parameter, waiting for the convergence of the loss function, and obtaining a learning process curve;
step 5: performing accuracy test and prediction on the LSTM network model to obtain an emotion value;
step 6: weighting the indexes of the three-dimensional index system in the step 1;
step 7: and establishing a cultural additional value calculation equation model to obtain a cultural additional evaluation value.
Based on the technical scheme, the step 1 specifically comprises the following steps: referring to the related documents of the existing cultural additional value evaluation and the hierarchical function view angle, constructing a three-dimensional index system based on a person-enterprise-society;
the three-dimensional index system based on the person-enterprise-society comprises the following steps: 3 primary indexes;
the 3 primary indexes include: cultural mental enjoyment, cultural brand shaping and cultural essence inheritance;
the cultural mental enjoyment includes the following secondary indicators: ornamental value of cultural products and artistic value of cultural products;
the cultural brand shaping comprises the following secondary indexes: the awareness of the cultural brands and the loyalty of the cultural brands;
the cultural essence inheritance comprises the following secondary indexes: inheritance of culture and transmissibility of culture.
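For illustration only, the index hierarchy described above might be represented in code as a simple mapping from primary to secondary indexes; this sketch is an assumption about data layout, not part of the claimed method, and the dictionary name is hypothetical.

```python
# Hedged sketch: the person-enterprise-society index system as a Python dict.
# The hierarchy follows the description above; the name is illustrative only.
CULTURAL_VALUE_INDEX_SYSTEM = {
    "cultural mental enjoyment": [
        "ornamental value of cultural products",
        "artistic value of cultural products",
    ],
    "cultural brand shaping": [
        "awareness of cultural brands",
        "loyalty of cultural brands",
    ],
    "cultural essence inheritance": [
        "inheritance of culture",
        "transmissibility of culture",
    ],
}
```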
On the basis of the technical scheme, the basic unit of the comment corpus is a single comment;
the specific steps of the step 2 are as follows:
step 2.1: the comments of the comment database are segmented by calling a segmentation module of the jieba tool, and a corpus segmentation result is obtained;
step 2.2: and setting necessary parameters such as word frequency retention threshold values and the like by adopting a TF-IDF algorithm of a jieba tool to obtain a characteristic word list required for representing the whole comment corpus.
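As an illustration of steps 2.1 and 2.2, a minimal sketch using the jieba toolkit is given below; the variable names, the use of extract_tags, and the threshold handling are assumptions, since the patent does not fix a particular implementation.

```python
import jieba
import jieba.analyse

# comments: list of raw comment strings forming the comment corpus (assumed input).

# Step 2.1: word segmentation of every comment with jieba's cut function.
segmented_corpus = [list(jieba.cut(comment)) for comment in comments]

# Step 2.2: TF-IDF keyword extraction over the whole corpus with jieba;
# withWeight=True returns (term, tf-idf weight) pairs so a retention threshold can be applied.
all_text = " ".join(comments)
weighted_terms = jieba.analyse.extract_tags(all_text, topK=None, withWeight=True)

# The description mentions a word-frequency retention threshold (e.g. 20 on its own TF-IDF
# values); jieba's weights are on a different scale, so the cutoff below is purely illustrative.
RETENTION_THRESHOLD = 0.05
candidate_keywords = [term for term, weight in weighted_terms if weight > RETENTION_THRESHOLD]
```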
Based on the technical scheme, the specific steps of the step 2.2 are as follows:
step 2.2.1: extracting keywords by using the TF-IDF (term frequency-inverse document frequency) algorithm, specifically: the calculation is performed by using formulas (1), (2) and (3),
wherein TF_ω is the word frequency of the term ω;
wherein IDF is the inverse document frequency; the fewer valid comments contain a term, the larger its IDF, and the better the term distinguishes categories;
TFIDF = TF_ω × IDF (3)
wherein TFIDF is the term frequency-inverse document frequency;
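Formulas (1)-(3) themselves are not reproduced in this text. The standard TF-IDF definitions below are consistent with the surrounding description and are given only as a presumed reconstruction, with n_ω the number of occurrences of term ω in a comment and |D| the number of valid comments:

```latex
% Presumed reconstruction of formulas (1)-(3); not copied from the patent drawings.
\begin{align}
TF_{\omega} &= \frac{n_{\omega}}{\sum_{k} n_{k}} \tag{1} \\
IDF &= \log\!\left(\frac{|D|}{|\{d \in D : \omega \in d\}| + 1}\right) \tag{2} \\
TFIDF &= TF_{\omega} \times IDF \tag{3}
\end{align}
```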
step 2.2.2: determining a word frequency retention threshold, and screening entries with a value of TFIDF higher than the word frequency retention threshold as keywords (for example, determining that the word frequency retention threshold is 20); such screening tends to filter out common words, preserving relatively important words;
counting word frequency of the keywords by using a Counter library to obtain candidate feature words;
the Counter library is one of python, belongs to the subclass of dictionary, the element is stored as the keyword of dictionary, and the number of times the keyword appears is stored as corresponding value;
Finally, the candidate feature words are manually screened and classified against the person-enterprise-society three-dimensional index system, yielding the feature word list required to represent the whole comment corpus.
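A minimal sketch of the Counter-based frequency count in step 2.2.2 follows; it assumes the segmented_corpus and candidate_keywords variables from the sketch above, and leaves the manual classification into the person-enterprise-society indexes as a comment.

```python
from collections import Counter

keyword_set = set(candidate_keywords)  # entries that passed the TF-IDF retention threshold

# Count how often each retained keyword occurs across the segmented corpus.
word_freq = Counter(
    word
    for sentence in segmented_corpus
    for word in sentence
    if word in keyword_set
)

# Candidate feature words, most frequent first; the final feature word list is produced by
# manually screening these candidates and assigning each to a secondary index of the
# person-enterprise-society index system.
candidate_feature_words = [word for word, _ in word_freq.most_common()]
```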
On the basis of the technical scheme, the feature sentences comprise: explicit feature sentences and implicit feature sentences;
the specific steps of the step 3 are as follows:
firstly, extracting explicit characteristic sentences;
traversing the word segmentation results of all corpus entries word by word and comparing them with the feature word list of step 2, the matched feature words being taken as the feature attributes of the comments in which the entries occur;
extracting the comments having feature attributes and marking them as explicit feature sentences;
performing dependency analysis on the extracted explicit feature sentence by using a Stanford NLP platform, and extracting the modifier of the explicit feature sentence;
the specific steps of extracting modifier words of the explicit feature sentences are as follows: traversing the entry of the explicit feature sentence word by word, comparing the entry with the modifier of the HowNet emotion dictionary, and taking the matched modifier as the modifier of the explicit feature sentence where the entry is located;
the HowNet emotion dictionary comprises: adjectives, nouns, verbs, adverbs, and combinations thereof;
For the explicit feature sentences matched to modifiers, the following processing is carried out:
the feature word of an explicit feature sentence is used as the dominant word and its modifier as the emotion word, and an "attribute feature-emotion word" pair is constructed, from which an attribute emotion word pair weight is obtained for each pair;
the attribute feature is the dominant word;
the attribute emotion word pair weight is denoted SQ and is calculated according to formula (4),
Second step: extracting implicit feature sentences;
aiming at the feature sentences which are not matched with the feature words, traversing the vocabulary entries word by word, and comparing the vocabulary entries with modifier words of the HowNet emotion dictionary;
when the feature sentence which is not matched with the feature word is not matched with the modifier word, deleting the feature sentence;
when the feature sentence which is not matched with the feature word is matched with the modifier, the matched modifier is used as the modifier of the feature sentence where the entry is located, and the modifier is used as the emotion word;
then, for each emotion word appearing in a feature sentence that matched no feature word, the attribute feature with the largest attribute emotion word pair weight (according to the weights obtained above) is selected as the feature word of that sentence;
the feature sentences that matched no feature word but obtain a feature word in this way are taken as implicit feature sentences;
the Stanford NLP platform is a natural language processing toolkit that integrates many practical functions, including word segmentation, part-of-speech tagging and syntactic analysis; the Stanford NLP platform is not a deep learning framework but a collection of trained models, and can be thought of as a piece of software; it is written in Java and has a Python interface;
that is: for the remaining comments that match no feature word, the feature attributes are not explicit enough, so the corpus word segmentation results are imported into the Stanford NLP platform for sentence-level dependency relation mining, through which the unclear feature attributes are uncovered.
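The matching logic of step 3 could be sketched roughly as follows. The function name, the feature_word_list and hownet_modifiers inputs, and the pair_weights structure are illustrative assumptions; the dependency analysis performed with the Stanford NLP platform is only indicated by a comment, since its exact use is not specified beyond extracting modifiers.

```python
def extract_feature_sentences(segmented_comments, feature_word_list, hownet_modifiers,
                              pair_weights):
    """Split comments into explicit and implicit feature sentences (illustrative sketch).

    pair_weights maps (attribute feature, emotion word) pairs to their SQ weight
    obtained from the explicit feature sentences (formula (4), not reproduced here).
    """
    explicit, unmatched, implicit = [], [], []

    for words in segmented_comments:
        matched_features = [w for w in words if w in feature_word_list]
        if matched_features:
            # Explicit feature sentence; dependency analysis with the Stanford NLP
            # platform would be run here to pair each feature word with its modifier.
            explicit.append((words, matched_features))
        else:
            unmatched.append(words)

    for words in unmatched:
        emotion_words = [w for w in words if w in hownet_modifiers]
        if not emotion_words:
            continue  # no modifier matched: the sentence is deleted
        # Pick the attribute feature with the largest pair weight for the matched emotion words.
        candidates = [pair for pair in pair_weights if pair[1] in emotion_words]
        if candidates:
            best_feature = max(candidates, key=pair_weights.get)[0]
            implicit.append((words, best_feature))

    return explicit, implicit
```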
Based on the technical scheme, the specific steps of the step 4 are as follows:
step 4.1: manually labeling each feature sentence aiming at the feature sentence extracted in the previous step;
the label expressing positive emotion is marked as +1, the label expressing negative emotion is marked as-1, and the label expressing neutral emotion is marked as 0;
step 4.2: converting the characteristic sentence into a word vector by using word2 vec;
classifying the feature sentences according to the secondary index and the primary index of the feature words matched with the feature sentences;
and taking the word vector, the feature words corresponding to the feature sentences, the classification results of the feature sentences and the labels corresponding to the feature sentences as: feature sentence data;
step 4.3: dividing the feature sentence data into training set data and test set data;
step 4.4: the quantitative ratio of the training set data to the test set data is set to 4:1.
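A minimal sketch of steps 4.1-4.4 using gensim's word2vec and scikit-learn is given below; the 100-dimensional vectors and the 4:1 split come from the description, while the library choice and remaining hyperparameters are assumptions.

```python
from gensim.models import Word2Vec
from sklearn.model_selection import train_test_split

# feature_sentences: list of token lists extracted in step 3 (assumed input);
# labels: manually assigned sentiment tags, +1 / 0 / -1, one per feature sentence.

# Step 4.2: train word2vec and turn each feature sentence into a sequence of 100-d vectors.
w2v = Word2Vec(sentences=feature_sentences, vector_size=100, window=5, min_count=1)

def to_word_vectors(tokens):
    return [w2v.wv[t] for t in tokens if t in w2v.wv]

vector_sequences = [to_word_vectors(s) for s in feature_sentences]

# Steps 4.3/4.4: split the feature sentence data into training and test sets at a 4:1 ratio.
train_X, test_X, train_y, test_y = train_test_split(
    vector_sequences, labels, test_size=0.2, random_state=42)
```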
Based on the technical scheme, the specific steps of the step 4 are as follows: training an LSTM network model by using training set data; the LSTM network model is tested using the test set data.
Based on the technical scheme, the activation function of the LSTM network is the tanh function, the word vector dimension is set to 100, and the data batch size is 32, i.e. 32 samples are selected as input each time.
In addition, to prevent overfitting during deep learning network training, neurons are temporarily dropped from the network with a certain probability, which weakens the joint adaptation between neuron nodes and strengthens generalization; cross-validation shows that setting the neuron dropout rate (i.e. the dropout value) to 0.5 produces the largest number of randomly generated network structures. Cross entropy is selected as the main parameter for plotting the learning curve of the LSTM network model; once the curve converges, the graph is drawn.
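A minimal Keras sketch matching the stated settings (tanh activation, 100-dimensional word vectors, batch size 32, dropout 0.5, cross-entropy loss) is shown below; the padded sequence length, LSTM width, optimizer and epoch count are assumptions, and the pad helper builds on the word-vector sequences from the previous sketch.

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

MAX_LEN = 50  # assumed maximum feature-sentence length (not specified in the description)

def pad(sequences):
    # Pad/truncate each sequence of 100-d word vectors to a fixed length.
    return pad_sequences(sequences, maxlen=MAX_LEN, dtype="float32",
                         padding="post", truncating="post")

model = Sequential([
    LSTM(64, activation="tanh", input_shape=(MAX_LEN, 100)),  # tanh activation, 100-d vectors
    Dropout(0.5),                                             # dropout value of 0.5
    Dense(3, activation="softmax"),                           # negative / neutral / positive
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Labels -1 / 0 / +1 are shifted to classes 0 / 1 / 2 for the categorical cross-entropy loss.
y_train = to_categorical(np.array(train_y) + 1, num_classes=3)
history = model.fit(pad(train_X), y_train, batch_size=32, epochs=20, validation_split=0.1)
# history.history["loss"] gives the cross-entropy learning curve to plot until convergence.
```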
Based on the technical scheme, the specific steps of the step 5 are as follows: checking the accuracy rate, recall rate and F1 value of the LSTM network model trained in the step 4; and obtaining emotion values of all the secondary indexes by using the test set.
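The accuracy, recall and F1 checks of step 5 might then be done with scikit-learn, for example; the macro averaging and the way emotion values are aggregated are assumptions, since the description does not specify them.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_pred = model.predict(pad(test_X)).argmax(axis=1)  # predicted classes 0 / 1 / 2
y_true = np.array(test_y) + 1                       # gold labels shifted to 0 / 1 / 2

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1       :", f1_score(y_true, y_pred, average="macro"))

# One possible reading of the secondary-index emotion value: the mean predicted polarity
# (-1 / 0 / +1) over the test-set feature sentences classified under that index.
```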
On the basis of the technical scheme, the weights of the indexes of the three-dimensional index system comprise: primary index weights (also known as primary index frequencies) and secondary index weights (also known as secondary index frequencies);
extracting a characteristic sentence with positive emotion;
the primary index weight is calculated according to formula (5): primary index weight = YJ1 / ZS (5),
wherein YJ1 is the frequency of occurrence (i.e. the number of times) of the matched primary index feature words in the feature sentences with positive emotion, and ZS is the frequency of occurrence of all matched feature words in the feature sentences with positive emotion;
the secondary index weight is calculated according to formula (6): secondary index weight = EJ2 / ZS2 (6),
wherein EJ2 is the frequency of occurrence of the matched secondary index feature words in the feature sentences with positive emotion, and ZS2 is the frequency of occurrence of all matched feature words belonging to the same primary index in the feature sentences with positive emotion.
Based on the technical scheme, the cultural additional value calculation equation model in step 7 is shown as formula (7):
cultural additional evaluation value = cultural mental enjoyment primary index weight × ("ornamental value of cultural products" secondary index weight × "ornamental value of cultural products" index emotion value + "artistic value of cultural products" secondary index weight × "artistic value of cultural products" index emotion value) + cultural brand shaping primary index weight × ("awareness of cultural brands" secondary index weight × "awareness of cultural brands" index emotion value + "loyalty of cultural brands" secondary index weight × "loyalty of cultural brands" index emotion value) + cultural essence inheritance primary index weight × ("inheritance of culture" secondary index weight × "inheritance of culture" index emotion value + "transmissibility of culture" secondary index weight × "transmissibility of culture" index emotion value) (7).
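Read as code, formula (7) is a weighted sum over the index hierarchy. The sketch below computes it from the primary and secondary index weights of step 6 and the secondary-index emotion values of step 5; it reuses the CULTURAL_VALUE_INDEX_SYSTEM mapping from the earlier sketch, and the example weights are the ones given later in the description (the emotion values remain placeholders).

```python
def cultural_additional_value(primary_weights, secondary_weights, emotion_values,
                              index_system):
    """Weighted sum of secondary-index emotion values, per formula (7)."""
    total = 0.0
    for primary, secondaries in index_system.items():
        inner = sum(secondary_weights[s] * emotion_values[s] for s in secondaries)
        total += primary_weights[primary] * inner
    return total

# Example index weights taken from the worked example in the description.
primary_weights = {"cultural mental enjoyment": 0.399,
                   "cultural brand shaping": 0.296,
                   "cultural essence inheritance": 0.305}
secondary_weights = {"ornamental value of cultural products": 0.638,
                     "artistic value of cultural products": 0.362,
                     "awareness of cultural brands": 0.569,
                     "loyalty of cultural brands": 0.431,
                     "inheritance of culture": 0.382,
                     "transmissibility of culture": 0.618}
```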
The beneficial technical effects of the application are as follows:
1. The application constructs a person-enterprise-society three-dimensional index system from the hierarchical-function perspective of cultural added value, comprising 3 primary indexes and 6 secondary indexes. The index system is systematic and hierarchical, and reflects the significance of perceived-value research for the development of the cultural industry;
2. For cultural added value, a perceived-value evaluation model based on LSTM fine-grained emotion analysis is adopted. The method remedies shortcomings of traditional evaluation models, such as overly subjective and hard-to-quantify evaluation indexes, and is well suited to the large-scale comment data found on network platforms.
Drawings
The application has the following drawings:
FIG. 1 is a schematic diagram of a three-dimensional index architecture based on person-enterprise-society according to the present application.
Fig. 2 is a schematic flow chart of the cultural added value assessment method based on LSTM.
Detailed Description
The present application will be described in further detail with reference to the accompanying drawings.
As shown in fig. 1-2, the present application aims at: a cultural added value evaluation method based on an LSTM neural network is provided, and an index system of the cultural added value and an LSTM emotion analysis evaluation model are constructed to solve the problems in the background technology.
The application is realized by the following technical scheme:
a cultural added value assessment method based on LSTM neural network comprises the following steps:
step 1: constructing a three-dimensional index system based on person-enterprise-society from the perspective of the hierarchical functions of cultural added value;
step 2: preparing a comment corpus of cultural products to be evaluated, performing word segmentation on the comment corpus, and then establishing a characteristic word list from the comment corpus of cultural products to be evaluated based on a TF-IDF algorithm;
step 3: extracting a characteristic sentence to obtain characteristic sentence data;
step 4: performing LSTM network model training by using the feature sentence data extracted in the step 3, selecting cross entropy as a loss function parameter, waiting for the convergence of the loss function, and obtaining a learning process curve;
step 5: performing accuracy testing and test set prediction with the LSTM network model to obtain emotion values;
step 6: and (3) weighting the indexes of the three-dimensional index system in the step (1).
step 7: establishing a cultural additional value calculation equation model to obtain the cultural additional evaluation value.
Further, step 1 specifically includes: referring to existing literature on cultural added value evaluation and the hierarchical-function perspective, a person-enterprise-society three-dimensional index system is constructed; cultural added value is taken to be represented by the sum of, and interrelations among, three primary-index elements: cultural mental enjoyment, cultural brand shaping, and cultural essence inheritance. On the basis of comprehensively and evenly covering the three traditional characteristic dimensions of cultural products (individual, enterprise and society), and combining the essential connotation of cultural elements, 6 secondary indexes are derived, namely the ornamental value of cultural products, the artistic value of cultural products, the awareness of cultural brands, the loyalty of cultural brands, the inheritance of culture and the transmissibility of culture, finally forming a cultural added value index system consisting of 3 primary indexes and 6 secondary indexes.
Further, the step 2 specifically includes: preparing a cultural product comment corpus to be evaluated, wherein the basic unit of the corpus is a single comment, word segmentation is carried out on the corpus by calling a jieba module to obtain a word segmentation result of the corpus, and then parameters such as necessary word frequency retention threshold and the like are set by adopting a TF-IDF algorithm of the jieba to obtain a characteristic word list required for representing the whole comment corpus.
Further, the step 3 is specifically two steps of extracting an explicit feature sentence and an implicit feature sentence. Traversing word segmentation results of the corpus, comparing the word segmentation results with the feature word list in the step 2, and taking the matched feature words as feature attributes of comments where the vocabulary entries are located;
extracting comments with characteristic attributes and marking the comments as explicit characteristic sentences;
for implicit feature sentences with insufficient clear feature attributes, the corpus word segmentation result is imported to a Standford NLP platform to excavate sentence-based dependency relationship, and the undefined feature attributes are excavated through the step.
In step 4, the feature sentences described by feature attributes under the same index are collected from the word segmentation results of the comment corpus for centralized analysis and classification. The word segmentation results of each category of comment corpus are labeled: each feature sentence is manually annotated, with the label +1 for positive emotion, -1 for negative emotion, and 0 for neutral emotion;
converting the characteristic sentence into a word vector by using word2 vec;
classifying the feature sentences according to the secondary index and the primary index of the feature words matched with the feature sentences;
and taking the word vector, the feature words corresponding to the feature sentences, the classification results of the feature sentences and the labels corresponding to the feature sentences as: feature sentence data;
dividing the feature sentence data into training set data and test set data;
the quantitative ratio of the training set data to the test set data is set to 4:1.
Step 4 is specifically as follows: based on the labeled word segmentation results of the comment corpus, an LSTM network model is trained; the tanh function is selected as the model's activation function, the word vector dimension is set to 100, and the data batch size is 32, i.e. 32 samples are selected as input each time. In addition, to prevent overfitting during deep learning network training, neurons are temporarily dropped from the network with a certain probability, which weakens the joint adaptation between neuron nodes and strengthens generalization; cross-validation shows that setting the dropout value to 0.5 produces the largest number of randomly generated network structures. Cross entropy is selected as the main parameter for plotting the model's learning curve; once the curve converges, the graph is drawn;
the step 5 specifically comprises the following steps: and (3) invoking the LSTM model trained in the step (4) to carry out emotion analysis on the corpus, checking the accuracy rate, recall rate and F1 value of the corpus, judging the performance of the model, and after the performance is confirmed, calculating the emotion values of all the secondary indexes.
The step 6 is specifically as follows: and (3) index weighting, screening the feature sentences with positive emotion polarity based on the classification result in the step (4), determining the corresponding frequency number of the secondary or primary index by comparing the feature word list, respectively calculating the primary index frequency and the secondary index frequency of the feature sentences, and setting the primary index frequency and the secondary index frequency as weights corresponding to the index values.
The step 7 is specifically as follows: and (3) establishing a cultural additional value calculation equation model, and referring to the weights of the indexes of each level formed by the step (6).
For example: cultural additional evaluation value (weighted total score) = 0.399 × (0.638 × "ornamental value of cultural products" index emotion value + 0.362 × "artistic value of cultural products" index emotion value) + 0.296 × (0.569 × "awareness of cultural brands" index emotion value + 0.431 × "loyalty of cultural brands" index emotion value) + 0.305 × (0.382 × "inheritance of culture" index emotion value + 0.618 × "transmissibility of culture" index emotion value),
wherein the decimals are the corresponding weights.
The foregoing description of the preferred embodiments of the application is not intended to limit the application to the form or principles of the application, but rather to cover all modifications, equivalents, alternatives, and improvements within the scope of the application.
What is not described in detail in this specification is prior art known to those skilled in the art.
Claims (6)
1. The LSTM-based cultural additional value assessment method is characterized by comprising the following steps of:
step 1: constructing a three-dimensional index system based on person-enterprise-society from the hierarchical function perspective of cultural additional value,
step 2: preparing a comment corpus of cultural products to be evaluated, performing word segmentation on the comment corpus, then establishing a characteristic word list representing the comment corpus of cultural products to be evaluated based on a TF-IDF algorithm,
step 3: extracting the characteristic sentence to obtain characteristic sentence data,
step 4: training LSTM network model by using the feature sentence data extracted in the step 3, selecting cross entropy as loss function parameter, waiting for the convergence of loss function to obtain learning process curve,
step 5: performing accuracy test and prediction on the LSTM network model to obtain emotion values,
step 6: weighting the indexes of the three-dimensional index system in the step 1,
step 7: establishing a cultural additional value calculation equation model to obtain a cultural additional evaluation value;
the three-dimensional index system based on the person-enterprise-society comprises the following steps: 3 primary indexes; the 3 primary indexes include: enjoyment of cultural spirit, modeling of cultural brands and inheritance of cultural essence,
the cultural mental enjoyment includes the following secondary indicators: ornamental value of cultural products and artistic quality of cultural products,
the cultural brand shaping comprises the following secondary indexes: the awareness of cultural brands and the loyalty of cultural brands,
the cultural essence inheritance comprises the following secondary indexes: inheritance of culture and transmissibility of culture; the basic unit of the comment library is a single comment;
the specific steps of the step 2 are as follows:
step 2.1: the comments of the comment database are segmented by calling a segmentation module of the jieba tool to obtain a corpus segmentation result,
step 2.2: setting word frequency retention threshold parameters by adopting a TF-IDF algorithm of a jieba tool to obtain a characteristic word list required for representing the whole comment corpus;
the specific steps of the step 2.2 are as follows:
step 2.2.1: extracting keywords by using the TF-IDF algorithm, specifically: the calculation is performed by using formulas (1), (2) and (3),
wherein TF_ω is the word frequency of the term ω,
wherein IDF is the inverse document frequency,
TFIDF = TF_ω × IDF (3)
wherein TFIDF is the term frequency-inverse document frequency;
step 2.2.2: determining a word frequency retention threshold, and screening entries with the numerical value of TFIDF higher than the word frequency retention threshold as keywords;
counting word frequency of the keywords by using a Counter library to obtain candidate feature words;
finally, according to a three-dimensional index system of a person, an enterprise and a society, classifying candidate feature words in a grading manner through manual screening and distinguishing, and obtaining a feature word list required for representing the whole comment corpus;
the feature sentences comprise: explicit feature sentences and implicit feature sentences;
the specific steps of the step 3 are as follows:
firstly, extracting explicit characteristic sentences;
traversing word by word for word segmentation results of all the corpus, comparing the word by word with the feature word list in the step 2, and taking the matched feature words as feature attributes of comments where the vocabulary entries are located;
extracting comments with characteristic attributes and marking the comments as explicit characteristic sentences;
performing dependency analysis on the extracted explicit feature sentence by using a Stanford NLP platform, and extracting a modifier of the explicit feature sentence;
the specific steps of extracting modifier words of the explicit feature sentences are as follows: traversing the entry of the explicit feature sentence word by word, comparing the entry with the modifier of the HowNet emotion dictionary, and taking the matched modifier as the modifier of the explicit feature sentence where the entry is located;
for the explicit feature sentences matched to modifiers, the following processing is carried out:
the feature word of an explicit feature sentence is used as the dominant word and its modifier as the emotion word, and an "attribute feature-emotion word" pair is constructed, from which an attribute emotion word pair weight is obtained for each pair;
the attribute feature is the dominant word;
the attribute emotion word pair weight is denoted SQ and is calculated according to formula (4),
second step: extracting implicit feature sentences;
aiming at the feature sentences which are not matched with the feature words, traversing the vocabulary entries word by word, and comparing the vocabulary entries with modifier words of the HowNet emotion dictionary;
when the feature sentence which is not matched with the feature word is not matched with the modifier word, deleting the feature sentence;
when the feature sentence which is not matched with the feature word is matched with the modifier, the matched modifier is used as the modifier of the feature sentence where the entry is located, and the modifier is used as the emotion word;
then, for each emotion word appearing in a feature sentence that matched no feature word, the attribute feature with the largest attribute emotion word pair weight, according to the obtained attribute feature-emotion word pair weights, is selected as the feature word of that sentence;
and taking the feature sentences which are not matched with the feature words as implicit feature sentences.
2. The LSTM based cultural additional value assessment method according to claim 1, wherein: the specific steps of the step 4 are as follows:
step 4.1: manually labeling each feature sentence aiming at the feature sentence extracted in the previous step;
the label expressing positive emotion is marked as +1, the label expressing negative emotion is marked as-1, and the label expressing neutral emotion is marked as 0;
step 4.2: converting the characteristic sentence into a word vector by using word2 vec;
classifying the feature sentences according to the secondary index and the primary index of the feature words matched with the feature sentences;
and taking the word vector, the feature words corresponding to the feature sentences, the classification results of the feature sentences and the labels corresponding to the feature sentences as: feature sentence data;
step 4.3: dividing the feature sentence data into training set data and test set data;
step 4.4: the quantitative ratio of the training set data to the test set data is set to 4:1.
3. The LSTM based cultural additional value assessment method according to claim 2, wherein: the specific steps of the step 4 are as follows: training an LSTM network model by using training set data; testing the LSTM network model by using the test set data;
the activation function of the LSTM network is the tanh function, the word vector dimension is set to 100, the data batch size is 32, and the neuron dropout rate is set to 0.5; cross entropy is selected as the parameter for plotting the learning curve of the LSTM network model.
4. The LSTM based cultural additional value assessment method according to claim 3, wherein: the specific steps of the step 5 are as follows: checking the accuracy rate, recall rate and F1 value of the LSTM network model trained in the step 4; and obtaining emotion values of all the secondary indexes by using the test set.
5. The LSTM based cultural additional value assessment method according to claim 4, wherein: the weights of the indexes of the three-dimensional index system comprise: a first level index weight and a second level index weight;
extracting a characteristic sentence with positive emotion;
the primary index weight is calculated according to formula (5): primary index weight = YJ1 / ZS (5),
wherein YJ1 is the frequency of occurrence of the matched primary index feature words in the feature sentences with positive emotion, and ZS is the frequency of occurrence of all matched feature words in the feature sentences with positive emotion;
the secondary index weight is calculated according to formula (6): secondary index weight = EJ2 / ZS2 (6),
wherein EJ2 is the frequency of occurrence of the matched secondary index feature words in the feature sentences with positive emotion, and ZS2 is the frequency of occurrence of all matched feature words belonging to the same primary index in the feature sentences with positive emotion.
6. The LSTM based cultural additional value assessment method according to claim 5, wherein: the cultural additional value calculation equation model in step 7 is shown as formula (7),
cultural additional evaluation value = cultural mental enjoyment primary index weight × ("ornamental value of cultural products" secondary index weight × "ornamental value of cultural products" index emotion value + "artistic value of cultural products" secondary index weight × "artistic value of cultural products" index emotion value) + cultural brand shaping primary index weight × ("awareness of cultural brands" secondary index weight × "awareness of cultural brands" index emotion value + "loyalty of cultural brands" secondary index weight × "loyalty of cultural brands" index emotion value) + cultural essence inheritance primary index weight × ("inheritance of culture" secondary index weight × "inheritance of culture" index emotion value + "transmissibility of culture" secondary index weight × "transmissibility of culture" index emotion value) (7).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110515653.3A CN113298365B (en) | 2021-05-12 | 2021-05-12 | Cultural additional value assessment method based on LSTM |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110515653.3A CN113298365B (en) | 2021-05-12 | 2021-05-12 | Cultural additional value assessment method based on LSTM |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113298365A CN113298365A (en) | 2021-08-24 |
CN113298365B true CN113298365B (en) | 2023-12-01 |
Family
ID=77321530
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110515653.3A Active CN113298365B (en) | 2021-05-12 | 2021-05-12 | Cultural additional value assessment method based on LSTM |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113298365B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010132062A1 (en) * | 2009-05-15 | 2010-11-18 | The Board Of Trustees Of The University Of Illinois | System and methods for sentiment analysis |
CN104699766A (en) * | 2015-02-15 | 2015-06-10 | 浙江理工大学 | Implicit attribute mining method integrating word correlation and context deduction |
KR20150083954A (en) * | 2014-01-10 | 2015-07-21 | 어니컴 주식회사 | System and method for providing platform of cultural content based on social network |
CN106651132A (en) * | 2016-11-17 | 2017-05-10 | 安徽华博胜讯信息科技股份有限公司 | DEA-based public cultural service performance evaluation method |
CN108108433A (en) * | 2017-12-19 | 2018-06-01 | 杭州电子科技大学 | A kind of rule-based and the data network integration sentiment analysis method |
US10431210B1 (en) * | 2018-04-16 | 2019-10-01 | International Business Machines Corporation | Implementing a whole sentence recurrent neural network language model for natural language processing |
CN110502744A (en) * | 2019-07-15 | 2019-11-26 | 同济大学 | A kind of text emotion recognition methods and device for history park evaluation |
KR20210044017A (en) * | 2019-10-14 | 2021-04-22 | 한양대학교 산학협력단 | Product review multidimensional analysis method and apparatus |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120254060A1 (en) * | 2011-04-04 | 2012-10-04 | Northwestern University | System, Method, And Computer Readable Medium for Ranking Products And Services Based On User Reviews |
CN111767741B (en) * | 2020-06-30 | 2023-04-07 | 福建农林大学 | Text emotion analysis method based on deep learning and TFIDF algorithm |
-
2021
- 2021-05-12 CN CN202110515653.3A patent/CN113298365B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010132062A1 (en) * | 2009-05-15 | 2010-11-18 | The Board Of Trustees Of The University Of Illinois | System and methods for sentiment analysis |
KR20150083954A (en) * | 2014-01-10 | 2015-07-21 | 어니컴 주식회사 | System and method for providing platform of cultural content based on social network |
CN104699766A (en) * | 2015-02-15 | 2015-06-10 | 浙江理工大学 | Implicit attribute mining method integrating word correlation and context deduction |
CN106651132A (en) * | 2016-11-17 | 2017-05-10 | 安徽华博胜讯信息科技股份有限公司 | DEA-based public cultural service performance evaluation method |
CN108108433A (en) * | 2017-12-19 | 2018-06-01 | 杭州电子科技大学 | A kind of rule-based and the data network integration sentiment analysis method |
US10431210B1 (en) * | 2018-04-16 | 2019-10-01 | International Business Machines Corporation | Implementing a whole sentence recurrent neural network language model for natural language processing |
CN110502744A (en) * | 2019-07-15 | 2019-11-26 | 同济大学 | A kind of text emotion recognition methods and device for history park evaluation |
KR20210044017A (en) * | 2019-10-14 | 2021-04-22 | 한양대학교 산학협력단 | Product review multidimensional analysis method and apparatus |
Non-Patent Citations (7)
Title |
---|
Framework for Sentiment-Driven Evaluation of Customer Satisfaction With Cosmetics Brand;Jaehun Park;《IEEE Access》;第8卷;98526-98538 * |
Hui Song 等.Semantic Analysis and Implicit Target Extraction of Comments from E-Commerce Websites.《2013 Fourth World Congress on Software Engineering》.2014,331-335. * |
Research and Practice of Cultural Heritage Promotion: The Case Study of Value Add Application for Folklore Artifacts;Kuo-An Wang等;《2012 International Symposium on Computer, Consumer and Control》;610-613 * |
吕家欣 等.文旅品牌顾客契合价值测量——基于细粒度情感分析模型.《投资与创业》.2023,第34卷(第01期),162-164. * |
基于情境系统的湖湘文创产品设计评价体系研究;祁飞鹤;《中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑》(第07期);C028-42 * |
大众媒介综合价值评估体系研究;周笑;《东岳论丛》;第30卷(第06期);42-48 * |
孟鹏 等.出版文化品牌价值影响因素及评价指标体系研究.《中国商论》.2019,(第23期),213-216. * |
Also Published As
Publication number | Publication date |
---|---|
CN113298365A (en) | 2021-08-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zheng et al. | Characterization inference based on joint-optimization of multi-layer semantics and deep fusion matching network | |
CN110008311B (en) | Product information safety risk monitoring method based on semantic analysis | |
CN109492157B (en) | News recommendation method and theme characterization method based on RNN and attention mechanism | |
CN111767741B (en) | Text emotion analysis method based on deep learning and TFIDF algorithm | |
CN112001187B (en) | Emotion classification system based on Chinese syntax and graph convolution neural network | |
CN109933664B (en) | Fine-grained emotion analysis improvement method based on emotion word embedding | |
US9183274B1 (en) | System, methods, and data structure for representing object and properties associations | |
CN107180045B (en) | Method for extracting geographic entity relation contained in internet text | |
CN104636425B (en) | A kind of network individual or colony's Emotion recognition ability prediction and method for visualizing | |
CN111914096A (en) | Public transport passenger satisfaction evaluation method and system based on public opinion knowledge graph | |
CN105843897A (en) | Vertical domain-oriented intelligent question and answer system | |
CN112001186A (en) | Emotion classification method using graph convolution neural network and Chinese syntax | |
CN108038725A (en) | A kind of electric business Customer Satisfaction for Product analysis method based on machine learning | |
CN110442728A (en) | Sentiment dictionary construction method based on word2vec automobile product field | |
CN110750648A (en) | Text emotion classification method based on deep learning and feature fusion | |
Miao et al. | A dynamic financial knowledge graph based on reinforcement learning and transfer learning | |
CN114817454B (en) | NLP knowledge graph construction method combining information quantity and BERT-BiLSTM-CRF | |
CN115757819A (en) | Method and device for acquiring information of quoting legal articles in referee document | |
Li | Research on extraction of useful tourism online reviews based on multimodal feature fusion | |
CN110826315B (en) | Method for identifying timeliness of short text by using neural network system | |
Sajeevan et al. | An enhanced approach for movie review analysis using deep learning techniques | |
CN111951079A (en) | Credit rating method and device based on knowledge graph and electronic equipment | |
CN113704459A (en) | Online text emotion analysis method based on neural network | |
CN107908749B (en) | Character retrieval system and method based on search engine | |
CN112905744A (en) | Qiaoqing question and answer method, device, equipment and storage device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |